(57) Abstract

The growth hormone supergene family comprises greater than 20 structurally related cytokines and growth factors. A general method 
is provided for creating site-specific, biologically active conjugates of these proteins. The method involves adding cysteine residues to 
non-essential regions of the proteins or substituting cysteine residues for non-essential amino acids in the proteins using site-directed 
mutagenesis and then covalently coupling a cysteine-reactive polymer or other type of cysteine-reactive moiety to the proteins via the
added cysteine residue. Disclosed herein are preferred sites for adding cysteine residues or introducing cysteine substitutions into the 
proteins, and the proteins and protein derivatives produced thereby. 




WO 99/03887 



PCT/US98/14497 



DERIVATIVES OF GROWTH HORMONE AND RELATED PROTEINS 

Field of the Invention 

The present invention relates to genetically engineered therapeutic proteins. More
specifically, the engineered proteins include growth hormone and related proteins.

Background of the Invention 

The following proteins are encoded by genes of the growth hormone (GH) supergene
family (Bazan (1990); Mott and Campbell (1995); Silvennoinen and Ihle (1996)): growth
hormone, prolactin, placental lactogen, erythropoietin (EPO), thrombopoietin (TPO),
interleukin-2 (IL-2), IL-3, IL-4, IL-5, IL-6, IL-7, IL-9, IL-10, IL-11, IL-12 (p35 subunit),
IL-13, IL-15, oncostatin M, ciliary neurotrophic factor, leukemia inhibitory factor, alpha
interferon, beta interferon, gamma interferon, omega interferon, tau interferon,
granulocyte-colony stimulating factor (G-CSF), granulocyte-macrophage colony stimulating
factor (GM-CSF), macrophage colony stimulating factor (M-CSF) and cardiotrophin-1 (CT-1)
("the GH supergene family"). It is anticipated that additional members of this gene family
will be identified in the future through gene cloning and sequencing. Members of the GH
supergene family have similar secondary and tertiary structures, despite the fact that they
generally have limited amino acid or DNA sequence identity. The shared structural features
allow new members of the gene family to be readily identified.

There is considerable interest on the part of patients and healthcare providers in the
development of long acting, "user-friendly" protein therapeutics. Proteins are expensive to
manufacture and, unlike conventional small molecule drugs, are not readily absorbed by the
body. Moreover, they are digested if taken orally. Therefore, natural proteins must be
administered by injection. After injection, most proteins are cleared rapidly from the body,
necessitating frequent, often daily, injections. Patients dislike injections, which leads to
reduced compliance and reduced drug efficacy. Some proteins, such as erythropoietin (EPO),
are effective when administered less often (three times per week for EPO) because they are
glycosylated. However, glycosylated proteins are produced using expensive mammalian cell
expression systems.

The length of time an injected protein remains in the body is finite and is determined
by, e.g., the protein's size and whether or not the protein contains covalent modifications such
as glycosylation. Circulating concentrations of injected proteins change constantly, often by
several orders of magnitude, over a 24-hour period. Rapidly changing concentrations of
protein agonists can have dramatic downstream consequences, at times under-stimulating and
at other times over-stimulating target cells. Similar problems plague protein antagonists.
These fluctuations can lead to decreased efficacy and increased frequency of adverse side
effects for protein therapeutics. The rapid clearance of recombinant proteins from the body
significantly increases the amount of protein required per patient and dramatically increases
the cost of treatment. The cost of human protein pharmaceuticals is expected to increase
dramatically in the years ahead as new and existing drugs are approved for more disease
indications.

Thus, there is a need to develop protein delivery technologies that lower the costs of
protein therapeutics to patients and healthcare providers. The present invention provides a
solution to this problem by providing methods to prolong the circulating half-lives of protein
therapeutics in the body so that the proteins do not have to be injected frequently. This
solution also satisfies the needs and desires of patients for protein therapeutics that are
"user-friendly", i.e., protein therapeutics that do not require frequent injections. The present
invention solves these and other problems by providing biologically active, cysteine-added
variants of members of the growth hormone supergene family. The invention also provides
for the chemical modification of these variants with cysteine-reactive polymers or other types
of cysteine-reactive moieties to produce derivatives thereof and the molecules so produced.

Summary of the Invention 

The present invention provides cysteine variants of members of the GH supergene 
family. The variants comprise a cysteine residue substituted for a nonessential amino acid of 
the proteins. Preferably, the variants comprise a cysteine residue substituted for an amino 
acid selected from amino acids in the loop regions, the ends of the alpha helices, proximal to 
the first amphipathic helix, and distal to the final amphipathic helix or wherein the cysteine 
residue is added at the N-terminus or C-terminus of the proteins. Preferred sites for 
substitution are the N- and O-linked glycosylation sites. 

Also provided are cysteine variants wherein the amino acid substituted for is in the
A-B loop, the B-C loop, the C-D loop or the D-E loop of interferon/IL-10-like members of the
GH supergene family.

Also provided are cysteine variants of members of the GH supergene family wherein
the cysteine residue is introduced between two amino acids in the natural protein. In
particular, the cysteine residue is introduced into the loop regions, the ends of the alpha
helices, proximal to the first amphipathic helix, or distal to the final amphipathic helix. Even
more particularly, the cysteine residue is introduced between two amino acids in an N- or
O-linked glycosylation site or adjacent to an amino acid in an N-linked or O-linked
glycosylation site.

More particularly, cysteine variants are provided wherein the loop region where the
cysteine is introduced is the A-B loop, the B-C loop, the C-D loop or the D-E loop of
interferon/IL-10-like members of the GH supergene family.

Such cysteine substitution or insertion mutations also can include the insertion of one
or more additional amino acids on the amino-terminal or carboxy-terminal side of the
cysteine substitution or insertion.

Also provided are cysteine variants that are further derivatised by PEGylation, and
the derivatised proteins produced thereby.

As set forth in the examples, specific cysteine variants of the members of the GH
supergene family also are provided, including, for example, variants of GH. The GH cysteine
variants can have the substituted-for amino acid or inserted cysteine located at the N-terminal
end of the A-B loop, the B-C loop, the C-D loop, the first three or last three amino acids in
the A, B, C and D helices and the amino acids proximal to helix A and distal to helix D.

More particularly, the cysteine can be substituted for the following amino acids: F1,
T3, P5, E33, A34, K38, E39, K40, S43, Q46, N47, P48, Q49, T50, S51, S55, T60, A98, N99,
S100, G104, A105, S106, E129, D130, G131, S132, P133, T135, G136, Q137, K140, Q141,
T142, S144, K145, D147, T148, N149, S150, H151, N152, D153, S184, E186, G187, S188,
and G190.
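
By way of illustration only, a residue list of the kind given above can be checked and expanded mechanically. The sketch below assumes the common single-letter convention in which a substitution is named by the wild-type residue, its position and the replacing residue (e.g., T3C for cysteine substituted for threonine-3); that naming convention and the function names are assumptions of this example and are not taken from the specification.

```python
import re

# One-letter residue code followed by a position number, e.g. "T3" or "E129".
SITE_PATTERN = re.compile(r"([A-Z])(\d+)")

def parse_sites(site_list):
    """Split a comma-separated site list into (residue, position) tuples.

    Raises ValueError on tokens that are not a letter plus digits, which
    catches transcription slips such as a letter typed in place of a digit.
    """
    sites = []
    for token in site_list.split(","):
        token = token.strip()
        match = SITE_PATTERN.fullmatch(token)
        if match is None:
            raise ValueError(f"unrecognised site: {token!r}")
        sites.append((match.group(1), int(match.group(2))))
    return sites

def cysteine_variant_names(site_list):
    """Name the cysteine substitution at each site, e.g. 'T3' -> 'T3C'."""
    return [f"{residue}{position}C" for residue, position in parse_sites(site_list)]

# A few of the GH sites listed in the text.
names = cysteine_variant_names("T3, P5, E33, G190")
```

Here `names` holds one variant label per listed site; a malformed token raises an error rather than being passed through silently.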

Other examples of cysteine variants according to the invention include erythropoietin
variants. Erythropoietin variants include those wherein the substituted-for amino acid is
located in the A-B loop, the B-C loop, the C-D loop, the amino acids proximal to helix A and
distal to helix D and the N- or C-terminus. Even more specifically, the EPO cysteine variants
include molecules wherein the amino acids indicated below have a cysteine substituted
therefor: serine-126, N24, I25, T26, N38, I39, T40, N83, S84, A1, P2, P3, R4, D8, S9, T27,
G28, A30, E31, H32, S34, N36, D43, T44, K45, N47, A50, K52, E55, G57, Q58, G77, Q78,
A79, Q86, W88, E89, T107, R110, A111, G113, A114, Q115, K116, E117, A118, S120,
P121, P122, D123, A124, A125, A127, A128, T132, K154, T157, G158, E159, A160, T163,
G164, D165, R166 and S85.

The members of the GH supergene family include growth hormone, prolactin,
placental lactogen, erythropoietin, thrombopoietin, interleukin-2, interleukin-3, interleukin-4,
interleukin-5, interleukin-6, interleukin-7, interleukin-9, interleukin-10, interleukin-11,
interleukin-12 (p35 subunit), interleukin-13, interleukin-15, oncostatin M, ciliary
neurotrophic factor, leukemia inhibitory factor, alpha interferon, beta interferon, gamma
interferon, omega interferon, tau interferon, granulocyte-colony stimulating factor,
granulocyte-macrophage colony stimulating factor, macrophage colony stimulating factor,
cardiotrophin-1 and other proteins identified and classified as members of the family. The
proteins can be derived from any animal species including human, companion animals and
farm animals.

Other variations and modifications to the invention will be obvious to those skilled in
the art based on the specification and the "rules" set forth herein. All of these are considered
as part of the invention.

Detailed Description of the Invention 

The present invention relates to cysteine variants and, among other things, the
site-specific conjugation of such proteins with polyethylene glycol (PEG) or other such moieties.
PEG is a non-antigenic, inert polymer that significantly prolongs the length of time a protein
circulates in the body. This allows the protein to be effective for a longer period of time.
Covalent modification of proteins with PEG has proven to be a useful method to extend the
circulating half-lives of proteins in the body (Abuchowski et al., 1984; Hershfield, 1987;
Meyers et al., 1991). Covalent attachment of PEG to a protein increases the protein's
effective size and reduces its rate of clearance from the body. PEGs are commercially
available in several sizes, allowing the circulating half-lives of PEG-modified proteins to be
tailored for individual indications through use of different size PEGs. Other benefits of PEG
modification include an increase in protein solubility, an increase in in vivo protein stability
and a decrease in protein immunogenicity (Katre et al., 1987; Katre, 1990).

The preferred method for PEGylating proteins is to covalently attach PEG to cysteine
residues using cysteine-reactive PEGs. A number of highly specific, cysteine-reactive PEGs
with different reactive groups (e.g., maleimide, vinylsulfone) and different size PEGs (2-20
kDa) are commercially available (e.g., from Shearwater Polymers, Inc., Huntsville, AL). At
neutral pH, these PEG reagents selectively attach to "free" cysteine residues, i.e., cysteine
residues not involved in disulfide bonds. The conjugates are hydrolytically stable. Use of
cysteine-reactive PEGs allows the development of homogeneous PEG-protein conjugates of
defined structure.

Considerable progress has been made in recent years in determining the structures of
commercially important protein therapeutics and understanding how they interact with their
protein targets, e.g., cell-surface receptors, proteases, etc. This structural information can be
used to design PEG-protein conjugates using cysteine-reactive PEGs. Cysteine residues in
most proteins participate in disulfide bonds and are not available for PEGylation using
cysteine-reactive PEGs. Through in vitro mutagenesis using recombinant DNA techniques,
additional cysteine residues can be introduced anywhere into the protein. The added cysteines
can be introduced at the beginning of the protein, at the end of the protein, between two
amino acids in the protein sequence or, preferably, substituted for an existing amino acid in
the protein sequence. The newly added "free" cysteines can serve as sites for the specific
attachment of a PEG molecule using cysteine-reactive PEGs. The added cysteine must be
exposed on the protein's surface and accessible for PEGylation for this method to be
successful. If the site used to introduce an added cysteine is non-essential for biological
activity, then the PEGylated protein will display essentially wild type (normal) in vitro
bioactivity. The major technical challenge in PEGylating proteins with cysteine-reactive
PEGs is the identification of surface exposed, non-essential regions in the target protein
where cysteine residues can be added or substituted for existing amino acids without loss of
bioactivity.

Cysteine-added variants of a few human proteins and PEG-polymer conjugates of
these proteins have been described. US patent 5,206,344 describes cysteine-added variants of
IL-2. These cysteine-added variants are located within the first 20 amino acids from the
amino terminus of the mature IL-2 polypeptide chain. The preferred cysteine variant is at
position 3 of the mature polypeptide chain, which corresponds to a threonine residue that is
O-glycosylated in the naturally occurring protein. Substitution of cysteine for threonine at
position 3 yields an IL-2 variant that can be PEGylated with a cysteine-reactive PEG and
retain full in vitro bioactivity (Goodson and Katre, 1990). In contrast, natural IL-2 PEGylated
with lysine-reactive PEGs displays reduced in vitro bioactivity (Goodson and Katre, 1990).
The effects of cysteine substitutions at other positions in IL-2 were not reported.

US patent 5,166,322 teaches cysteine-added variants of IL-3. These variants are
located within the first 14 amino acids from the N-terminus of the mature protein sequence.
The patent teaches expression of the proteins in bacteria and covalent modification of the
proteins with cysteine-reactive PEGs. No information is provided as to whether the
cysteine-added variants and PEG-conjugates of IL-3 are biologically active. Cysteine-added
variants at other positions in the polypeptide chain were not reported.

World patent application WO9412219 and PCT application US95/06540 teach
cysteine-added variants of insulin-like growth factor-I (IGF-I). IGF-I has a very different
structure from GH and is not a member of the GH supergene family (Mott and Campbell,
1995). Cysteine substitutions at many positions in the IGF-I protein are described. Only
certain of the cysteine-added variants are biologically active. The preferred site for the
cysteine-added variant is at amino acid position 69 in the mature protein chain. Cysteine
substitutions at positions near the N-terminus of the protein (residues 1-3) yielded IGF-I
variants with reduced biological activities and improper disulfide bonds.

World patent application WO9422466 teaches two cysteine-added variants of
insulin-like growth factor (IGF) binding protein-1, which has a very different structure than
GH and is not a member of the GH supergene family. The two cysteine-added IGF binding
protein-1 variants disclosed are located at positions 98 and 101 in the mature protein chain and
correspond to serine residues that are phosphorylated in the naturally-occurring protein.

US patent application 07/822296 teaches cysteine-added variants of tumor necrosis
factor binding protein, which is a soluble, truncated form of the tumor necrosis factor cellular
receptor. Tumor necrosis factor binding protein has a very different structure than GH and is
not a member of the GH supergene family.

IGF-I, IGF binding protein-1 and tumor necrosis factor binding protein have
secondary and tertiary structures that are very different from GH and the proteins are not
members of the GH supergene family. Because of this, it is difficult to use the information
gained from studies of IGF-I, IGF binding protein-1 and tumor necrosis factor binding protein
to create cysteine-added variants of members of the GH supergene family. The studies with
IL-2 and IL-3 were carried out before the structures of IL-2 and IL-3 were known (McKay,
1992; Bazan, 1992) and before it was known that these proteins are members of the GH
supergene family. Previous experiments aimed at identifying preferred sites for adding
cysteine residues to IL-2 and IL-3 were largely empirical and were performed prior to
experiments indicating that members of the GH supergene family possessed similar secondary
and tertiary structures.

Based on the structural information now available for members of the GH supergene
family, the present invention provides "rules" for determining a priori which regions and
amino acid residues in members of the GH supergene family can be used to introduce or
substitute cysteine residues without significant loss of biological activity. In contrast to the
naturally occurring proteins, these cysteine-added variants of members of the GH supergene
family will possess novel properties such as the ability to be covalently modified at defined
sites within the polypeptide chain with cysteine-reactive polymers or other types of
cysteine-reactive moieties. The covalently modified proteins will be biologically active.

GH is the best-studied member of the GH supergene family. GH is a 22 kDa protein
secreted by the pituitary gland. GH stimulates metabolism of bone, cartilage and muscle and
is the body's primary hormone for stimulating somatic growth during childhood.
Recombinant human GH (rhGH) is used to treat short stature resulting from GH inadequacy
and renal failure in children. GH is not glycosylated and can be produced in a fully active
form in bacteria. The protein has a short in vivo half-life and must be administered by daily
subcutaneous injection for maximum effectiveness (MacGillivray et al., 1996). Recombinant
human GH (rhGH) was approved recently for treating cachexia in AIDS patients and is under
study for treating cachexia associated with other diseases.

The sequence of human GH is well known (see, e.g., Martial et al., 1979; Goeddel et
al., 1979, which are incorporated herein by reference; SEQ ID NO: 1). GH is closely related in
sequence to prolactin and placental lactogen and these three proteins were considered
originally to comprise a small gene family. The primary sequence of GH is highly conserved
among animal species (Abdel-Meguid et al., 1987), consistent with the protein's broad
species cross-reactivity. The three dimensional folding pattern of porcine GH has been
solved by X-ray crystallography (Abdel-Meguid et al., 1987). The protein has a compact
globular structure, comprising four amphipathic alpha helical bundles joined by loops.
Human GH has a similar structure (de Vos et al., 1992). The four alpha helical regions are
termed A-D beginning from the N-terminus of the protein. The loop regions are referred to
by the helical regions they join, e.g., the A-B loop joins helical bundles A and B. The A-B
and C-D loops are long, whereas the B-C loop is short. GH contains four cysteine residues,
all of which participate in disulfide bonds. The disulfide assignments are cysteine 53 joined to
cysteine 165 and cysteine 182 joined to cysteine 189.
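
The distinction between disulfide-bonded and "free" cysteines drawn above can be expressed as a short check. The following is a minimal sketch only: the disulfide assignments are those stated in the text, whereas the poly-alanine placeholder sequences, the helper names and the choice of position 3 for an added cysteine are illustrative assumptions and do not represent the actual GH sequence (SEQ ID NO: 1).

```python
# Disulfide assignments for GH as stated in the text (1-based positions).
GH_DISULFIDES = [(53, 165), (182, 189)]

def free_cysteines(sequence, disulfide_pairs):
    """Return 1-based positions of cysteines not engaged in a listed disulfide."""
    bonded = {position for pair in disulfide_pairs for position in pair}
    return [
        index
        for index, residue in enumerate(sequence, start=1)
        if residue == "C" and index not in bonded
    ]

def placeholder_with_cysteines(length, positions):
    """Build a poly-alanine placeholder with cysteines at the given positions.

    This is NOT a real protein sequence; it only carries cysteine positions.
    """
    residues = ["A"] * length
    for position in positions:
        residues[position - 1] = "C"
    return "".join(residues)

# Placeholder carrying only the four native cysteine positions.
native_like = placeholder_with_cysteines(191, [53, 165, 182, 189])
# Same placeholder with a hypothetical added cysteine at position 3.
variant_like = placeholder_with_cysteines(191, [3, 53, 165, 182, 189])

native_free = free_cysteines(native_like, GH_DISULFIDES)
variant_free = free_cysteines(variant_like, GH_DISULFIDES)
```

Under these assumptions the native-like placeholder has no free cysteine, while the variant-like placeholder has exactly one, at the added position, which is the residue a cysteine-reactive reagent would be expected to target.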

The crystal structure of GH bound to its receptor revealed that GH has two receptor
binding sites and binds two receptor molecules (Cunningham et al., 1991; de Vos et al.,
1992). The two receptor binding sites are referred to as site I and site II. Site I encompasses
the Carboxy (C)-terminal end of helix D and parts of helix A and the A-B loop, whereas site
II encompasses the Amino (N)-terminal region of helix A and a portion of helix C. Binding
of GH to its receptor occurs sequentially, with site I always binding first. Site II then engages
a second GH receptor, resulting in receptor dimerization and activation of the intracellular
signaling pathways that lead to cellular responses to GH. A GH mutein in which site II has
been mutated (a glycine to arginine mutation at amino acid 120) is able to bind a single GH
receptor, but is unable to dimerize GH receptors; this mutein acts as a GH antagonist in vitro,
presumably by occupying GH receptor sites without activating intracellular signaling
pathways (Fuh et al., 1992).

The roles of particular regions and amino acids in GH receptor binding and
intracellular signaling also have been studied using techniques such as mutagenesis,
monoclonal antibodies and proteolytic digestion. The first mutagenesis experiments entailed
replacing entire domains of GH with similar regions of the closely related protein, prolactin
(Cunningham et al., 1989). One finding was that replacement of the B-C loop of GH with
that of prolactin did not affect binding of the hybrid GH protein to a soluble form of the
human GH receptor, implying that the B-C loop was non-essential for receptor binding.
Alanine scanning mutagenesis (replacement of individual amino acids with alanine) identified
14 amino acids that are critical for GH bioactivity (Cunningham and Wells, 1989). These
amino acids are located in the helices A, B, C, and D and the A-B loop and correspond to
sites I and II identified from the structural studies. Two lysine residues at amino acid
positions 41 and 172, K41 and K172, were determined to be critical components of the site I
receptor binding site, which explains the decrease in bioactivity observed when K172 is
acetylated (Teh and Chapman, 1988). Modification of K168 also significantly reduced GH
receptor binding and bioactivity (de la Llosa et al., 1985; Martal et al., 1985; Teh and
Chapman, 1988). Regions of GH responsible for binding the GH receptor have also been
studied using monoclonal antibodies (Cunningham et al., 1989). A series of eight
monoclonal antibodies was generated to human GH and analyzed for the ability to neutralize
GH activity and prevent binding of GH to its recombinant soluble receptor. The latter studies
allowed the putative binding site for each monoclonal antibody to be localized within the GH
three-dimensional structure. Of interest was that monoclonal antibodies 1 and 8 were unable
to displace GH from binding its receptor. The binding sites for these monoclonal antibodies
were localized to the B-C loop (monoclonal number 1) and the N-terminal end of the A-B
loop (monoclonal number 8). No monoclonals were studied that bound the C-D loop
specifically. The monoclonal antibody studies suggest that the B-C loop and N-terminal end
of the A-B loop are non-essential for receptor binding. Finally, limited cleavage of GH with
trypsin was found to produce a two chain derivative that retained full activity (Mills et al.,
1980; Li, 1982). Mapping studies indicated that trypsin cleaved and/or deleted amino acids
between positions 134 and 149, which corresponds to the C-D loop. These studies suggest
the C-D loop is not involved in receptor binding or GH bioactivity.

Structures of a number of cytokines, including G-CSF (Hill et al., 1993), GM-CSF
(Diederichs et al., 1991; Walter et al., 1992), IL-2 (Bazan, 1992; McKay, 1992), IL-4
(Redfield et al., 1991; Powers et al., 1992), and IL-5 (Milburn et al., 1993) have been
determined by X-ray diffraction and NMR studies and show striking conservation with the
GH structure, despite a lack of significant primary sequence homology. EPO is considered to
be a member of this family based upon modeling and mutagenesis studies (Boissel et al.,
1993; Wen et al., 1994). A large number of additional cytokines and growth factors including
ciliary neurotrophic factor (CNTF), leukemia inhibitory factor (LIF), thrombopoietin (TPO),
oncostatin M, macrophage colony stimulating factor (M-CSF), IL-3, IL-6, IL-7, IL-9, IL-12,
IL-13, IL-15, and alpha, beta, omega, tau and gamma interferon belong to this family
(reviewed in Mott and Campbell, 1995; Silvennoinen and Ihle, 1996). All of the above
cytokines and growth factors are now considered to comprise one large gene family, of which
GH is the prototype.

In addition to sharing similar secondary and tertiary structures, members of this family
share the property that they must oligomerize cell surface receptors to activate intracellular
signaling pathways. Some GH family members, e.g., GH and EPO, bind a single type of
receptor and cause it to form homodimers. Other family members, e.g., IL-2, IL-4, and IL-6,
bind more than one type of receptor and cause the receptors to form heterodimers or higher
order aggregates (Davis et al., 1993; Paonessa et al., 1995; Mott and Campbell, 1995).
Mutagenesis studies have shown that, like GH, these other cytokines and growth factors
contain multiple receptor binding sites, typically two, and bind their cognate receptors
sequentially (Mott and Campbell, 1995; Matthews et al., 1996). Like GH, the primary
receptor binding sites for these other family members occur primarily in the four alpha helices
and the A-B loop (reviewed in Mott and Campbell, 1995). The specific amino acids in the
helical bundles that participate in receptor binding differ amongst the family members (Mott
and Campbell, 1995). Most of the cell surface receptors that interact with members of the
GH supergene family are structurally related and comprise a second large multi-gene family
(Bazan, 1990; Mott and Campbell, 1995; Silvennoinen and Ihle, 1996).

A general conclusion reached from mutational studies of various members of the GH
supergene family is that the loops joining the alpha helices generally tend to not be involved
in receptor binding. In particular the short B-C loop appears to be non-essential for receptor
binding in most, if not all, family members. For this reason, the B-C loop is a preferred
region for introducing cysteine substitutions in members of the GH supergene family. The
A-B loop, the B-C loop, the C-D loop (and D-E loop of interferon/IL-10-like members of the
GH superfamily) also are preferred sites for introducing cysteine mutations. Amino acids
proximal to helix A and distal to the final helix also tend not to be involved in receptor
binding and also are preferred sites for introducing cysteine substitutions. Certain members of
the GH family, e.g., EPO, IL-2, IL-3, IL-4, IL-6, G-CSF, GM-CSF, TPO, IL-10, IL-12 p35,
IL-13, IL-15 and beta-interferon contain N-linked and O-linked sugars. The glycosylation
sites in the proteins occur almost exclusively in the loop regions and not in the alpha helical
bundles. Because the loop regions generally are not involved in receptor binding and because
they are sites for the covalent attachment of sugar groups, they are preferred sites for
introducing cysteine substitutions into the proteins. Amino acids that comprise the N- and
O-linked glycosylation sites in the proteins are preferred sites for cysteine substitutions because
these amino acids are surface-exposed, the natural protein can tolerate bulky sugar groups
attached to the proteins at these sites and the glycosylation sites tend to be located away from
the receptor binding sites.

Many additional members of the GH gene family are likely to be discovered in the
future. New members of the GH supergene family can be identified through computer-aided
secondary and tertiary structure analyses of the predicted protein sequences. Members of the
GH supergene family will possess four or five amphipathic helices joined by non-helical
amino acids (the loop regions). The proteins may contain a hydrophobic signal sequence at
their N-terminus to promote secretion from the cell. Such later discovered members of the
GH supergene family also are included within this invention.

The present invention provides "rules" for creating biologically active cysteine-added
variants of members of the GH supergene family. These "rules" can be applied to any existing
or future member of the GH supergene family. The cysteine-added variants will possess novel
properties not shared by the naturally occurring proteins. Most importantly, the
cysteine-added variants will possess the property that they can be covalently modified with
cysteine-reactive polymers or other types of cysteine-reactive moieties to generate biologically
active proteins with improved properties such as increased in vivo half-life, increased solubility
and improved in vivo efficacy.

Specifically, the present invention provides biologically active cysteine variants of 
members of the GH supergene family by substituting cysteine residues for non-essential 
amino acids in the proteins. Preferably, the cysteine residues are substituted for amino acids 
that comprise the loop regions, for amino acids near the ends of the alpha helices and for 
amino acids proximal to the first amphipathic helix or distal to the final amphipathic helix of 
these proteins. Other preferred sites for adding cysteine residues are at the N-terminus or 
C-terminus of the proteins. Cysteine residues also can be introduced between two amino acids 
in the disclosed regions of the polypeptide chain. The present invention teaches that N- and 
O-linked glycosylation sites in the proteins are preferred sites for introducing cysteine 
substitutions either by substitution for amino acids that make up the sites or, in the case of 
N-linked sites, introduction of cysteines therein. The glycosylation sites can be serine or 
threonine residues that are O-glycosylated or asparagine residues that are N-glycosylated. 
N-linked glycosylation sites have the general structure asparagine-X-serine or threonine 
(N-X-S/T), where X can be any amino acid. The asparagine residue, the amino acid in the X 
position and the serine/threonine residue of the N-linked glycosylation site are preferred sites 
for creating biologically active cysteine-added variants of these proteins. Amino acids 
immediately surrounding or adjacent to the O-linked and N-linked glycosylation sites (within 
about 10 residues on either side of the glycosylation site) are preferred sites for introducing 
cysteine substitutions. 
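
The N-X-S/T rule above lends itself to a simple sequence scan. The following is a minimal sketch, not part of the disclosed method: the function names and the toy sequence are hypothetical, and X is allowed to be any residue because that is how the motif is stated here.

```python
import re

def find_n_linked_sites(seq):
    """Return 1-based positions of the Asn in every N-X-S/T motif.

    X is allowed to be any residue, following the text above.
    """
    # A lookahead keeps overlapping motifs (e.g. N-N-S-T) from being skipped.
    return [m.start() + 1 for m in re.finditer(r"(?=N.[ST])", seq)]

def candidate_window(seq, asn_pos, flank=10):
    """1-based inclusive range covering the motif plus `flank` residues per side."""
    start = max(1, asn_pos - flank)
    end = min(len(seq), asn_pos + 2 + flank)
    return start, end

toy = "MKTAYNGSLLQNPTAAGNATW"  # hypothetical sequence, not a real protein
sites = find_n_linked_sites(toy)
windows = [candidate_window(toy, pos, flank=3) for pos in sites]
```

With the default `flank=10`, `candidate_window` returns the "within about 10 residues on either side" range described above, clipped to the ends of the sequence.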

More generally, certain of the "rules" for identifying preferred sites for creating 
biologically active cysteine-added protein variants can be applied to any protein, not just 
proteins that are members of the GH supergene family. Specifically, preferred sites for 
creating biologically active cysteine variants of proteins (other than IL-2) are O-linked 
glycosylation sites. Amino acids immediately surrounding the O-linked glycosylation site 
(within about 10 residues on either side of the glycosylation site) also are preferred sites. 
N-linked glycosylation sites, and the amino acid residues immediately adjacent on either side of 
the glycosylation site (within about 10 residues of the N-X-S/T site) also are preferred sites 
for creating cysteine-added protein variants. Amino acids that can be replaced with cysteine 
without significant loss of biological activity also are preferred sites for creating 
cysteine-added protein variants. Such non-essential amino acids can be identified by performing 
cysteine-scanning mutagenesis on the target protein and measuring effects on biological 
activity. Cysteine-scanning mutagenesis entails adding or substituting cysteine residues for 
individual amino acids in the polypeptide chain and determining the effect of the cysteine 
substitution on biological activity. Cysteine-scanning mutagenesis is similar to 
alanine-scanning mutagenesis (Cunningham et al., 1992), except that target amino acids are 
individually replaced with cysteine rather than alanine residues. 
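
As an illustration of the bookkeeping involved in cysteine-scanning mutagenesis, the sketch below enumerates every single-cysteine substitution of a short peptide. It only generates the variant sequences; measuring biological activity is the experimental step. The function name is hypothetical, and the seven-residue input is the N-terminal stretch of mature GH as encoded by primer BB1 in Example 1.

```python
def cysteine_scan(seq, positions=None):
    """Yield (name, variant) for each single Cys substitution; positions are 1-based."""
    if positions is None:
        positions = range(1, len(seq) + 1)
    for pos in positions:
        wild_type = seq[pos - 1]
        if wild_type == "C":
            continue  # already a cysteine, nothing to substitute
        # Name follows the usual convention, e.g. T3C = Thr at position 3 -> Cys.
        yield f"{wild_type}{pos}C", seq[:pos - 1] + "C" + seq[pos:]

variants = dict(cysteine_scan("FPTIPLS"))
```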

Application of the "rules" to create cysteine-added variants and conjugates of protein 
antagonists also is contemplated. Excess production of cytokines and growth factors has been 
implicated in the pathology of many inflammatory conditions such as rheumatoid arthritis, 
asthma, allergies and wound scarring. Excess production of GH has been implicated as a 
cause of acromegaly. Certain growth factors and cytokines, e.g., GH and IL-6, have been 
implicated in proliferation of particular cancers. Many of the growth factors and cytokines 
implicated in inflammation and cancer are members of the GH supergene family. There is 
considerable interest in developing protein antagonists of these molecules to treat these 
diseases. One strategy involves engineering the cytokines and growth factors so that they can 
bind to, but not oligomerize, receptors. This is accomplished by mutagenizing the second 
receptor binding site (site II) on the molecules. The resulting muteins are able to bind and 
occupy receptor sites but are incapable of activating intracellular signaling pathways. This 
strategy has been successfully applied to GH to make a GH antagonist (Cunningham et al., 
1992). Similar strategies are being pursued to develop antagonists of other members of the 
GH supergene family such as IL-2 (Zurawski et al., 1990; Zurawski and Zurawski, 1992), 
IL-4 (Kruse et al., 1992), IL-5 (Tavernier et al., 1995), GM-CSF (Hercus et al., 1994) and EPO 
(Matthews et al., 1996). Since the preferred sites for adding cysteine residues to members of 
the GH supergene family described here lie outside of the receptor binding sites in these 
proteins, and thus are removed from any sites used to create protein antagonists, the 
cysteine-added variants described herein could be used to generate long-acting versions of 
protein antagonists. As an example, Cunningham et al. (1992) developed an in vitro GH 
antagonist by mutating a glycine residue (amino acid 120) to an arginine. This glycine residue 
is a critical component of the second receptor binding site in GH; when it is replaced with 
arginine, GH cannot dimerize receptors. The glycine to arginine mutation at position 120 can 
be introduced into DNA sequences encoding the cysteine-added variants of GH contemplated 
herein to create a cysteine-added GH antagonist that can be conjugated with cysteine-reactive 
PEGs or other types of cysteine-reactive moieties. Similarly, amino acid changes in other 
proteins that turn the proteins from agonists to antagonists could be incorporated into DNA 
sequences encoding cysteine-added protein variants described herein. Considerable effort is 
being spent to identify amino acid changes that convert protein agonists to antagonists. 
Hercus et al. (1994) reported that substituting arginine or lysine for glutamic acid at position 
21 in the mature GM-CSF protein converts GM-CSF from an agonist to an antagonist. 
Tavernier et al. (1995) reported that substituting glutamine for glutamic acid at position 13 of 
mature IL-5 creates an IL-5 antagonist. 

Experimental strategies similar to those described above can be used to create 
cysteine-added variants (both agonists and antagonists) of members of the GH supergene 
family derived from various animals. This is possible because the primary amino acid 
sequences and structures of cytokines and growth factors are largely conserved between 
human and animal species. For this reason, the "rules" disclosed herein for creating 
biologically active cysteine-added variants of members of the GH supergene family will be 
useful for creating biologically active cysteine-added variants of members of the GH 
supergene family of companion animals (e.g., dogs, cats, horses) and commercial animal 
(e.g., cow, sheep, pig) species. Conjugation of these cysteine-added variants with 
cysteine-reactive PEGs will create long-acting versions of these proteins that will benefit the 
companion animal and commercial farm animal markets. 

Proteins that are members of the GH supergene family (hematopoietic cytokines) are 
provided in Silvennoinen and Ihle (1996). Silvennoinen and Ihle (1996) also provide 
information about the structure and expression of these proteins. DNA sequences, encoded 
amino acids and in vitro and in vivo bioassays for the proteins described herein are described 
in Aggarwal and Gutterman (1992; 1996), Aggarwal (1998), and Silvennoinen and Ihle 
(1996). Bioassays for the proteins also are provided in catalogues of various commercial 
suppliers of these proteins such as R&D Systems, Inc. and Endogen, Inc. 

The following examples are provided to demonstrate how these "rules" can be used to 
create cysteine-added variants of GH, erythropoietin, alpha interferon, beta interferon, G- 
CSF, GM-CSF and other members of the GH supergene family. The examples are not 
intended to be limiting, but only exemplary of specific embodiments of the invention. 

Example 1 
Cysteine-added variants of GH 

This example discloses certain amino acids in GH that are non-essential for biological 
activity and which, when mutated to cysteine residues, will not alter the normal disulfide 
binding pattern and overall conformation of the molecule. These amino acids are located at 
the N-terminal end of the A-B loop (amino acids 34-52 of the mature protein sequence; SEQ 
ID NO: 1; Martial et al 1979; Goeddel et al 1979), the B-C loop (amino acids 97-105 in the 
mature protein sequence), and the C-D loop (amino acids 130-153 in the mature protein 
sequence). Also identified as preferred sites for introducing cysteine residues are the first 
three or last three amino acids in the A, B, C and D helices and the amino acids proximal to 
helix A and distal to helix D. 
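
The loop boundaries quoted above can be captured in a small lookup so that a named substitution can be checked against them. This is only an illustrative sketch: the dictionary and function names are hypothetical, the ranges are copied from the paragraph above, and the helix-end and terminal sites are deliberately left out.

```python
import re

# Residue ranges (mature GH numbering) as quoted in the text above.
GH_LOOPS = {
    "A-B loop (N-terminal end)": (34, 52),
    "B-C loop": (97, 105),
    "C-D loop": (130, 153),
}

def parse_substitution(name):
    """Split a name such as 'T135C' into (wild-type residue, position, new residue)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", name)
    if match is None:
        raise ValueError(f"not a substitution name: {name!r}")
    return match.group(1), int(match.group(2)), match.group(3)

def loop_for(position):
    """Return the name of the loop containing `position`, or None."""
    for loop, (first, last) in GH_LOOPS.items():
        if first <= position <= last:
            return loop
    return None

_, pos, _ = parse_substitution("T135C")
region = loop_for(pos)
```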

DNA sequences encoding wild type GH can be amplified using the polymerase chain 
reaction technique from commercially available single-stranded cDNA prepared from human 
pituitaries (ClonTech, San Diego, CA) or assembled using overlapping oligonucleotides. 
Specific mutations can be introduced into the GH sequence using a variety of procedures such 
as phage techniques (Kunkel et al 1987), PCR mutagenesis techniques (Innis et al 1990; 
White 1993), or mutagenesis kits such as those sold by Stratagene ("Quick-Change Mutagenesis" 
kit, San Diego, CA) or Promega (Gene Editor Kit, Madison, WI). 

Cysteine substitutions can be introduced into any of the amino acids comprising the 
B-C loop, C-D loop and N-terminal end of the A-B loop or into the first three amino acids of 
the alpha helical regions that adjoin these regions, or in the region proximal to helix A or 
distal to helix D. Preferred sites for introduction of cysteine residues are: F1, T3, P5, E33, 
A34, K38, E39, Q40, S43, Q46, N47, P48, Q49, T50, S51, S55, T60, A98, N99, S100, G104, 
A105, S106, E129, D130, G131, S132, P133, T135, G136, Q137, K140, Q141, T142, S144, 
K145, D147, T148, N149, S150, H151, N152, D153, S184, E186, G187, S188, and G190. 
Cysteine residues also can be introduced at the beginning of the mature protein, i.e., proximal 
to the F1 amino acid, or following the last amino acid in the mature protein, i.e., following 
F191. If desirable, two or more such mutations can be readily combined in the same protein 
either by in vitro DNA recombination of cloned mutant genes and/or sequential construction 
of individual desired mutations. 

1. Cloning the Gene for human Growth Hormone (GH) 

The human GH gene was amplified from human pituitary single-stranded cDNA 
(commercially available from CLONTECH, Inc., Palo Alto, CA) using the polymerase chain 
reaction (PCR) technique and primers BB1 and BB2. The sequence of BB1 is 
5'-GGGGGTCGACCATATGTTCCCAACCATTCCCTTATCCAG-3' (SEQ ID NO: 24). The 
sequence of BB2 is 5'-GGGGGATCCTCACTAGAAGCCACAGCTGCCCTC-3' (SEQ ID 
NO: 25). Primer BB1 was designed to encode an initiator methionine preceding the first 
amino acid of mature GH, phenylalanine, and SalI and NdeI sites for cloning purposes. The 
reverse primer, BB2, contains a BamHI site for cloning purposes. The 100 microliter PCR 
reactions contained 20 pmoles of each oligonucleotide primer, 1X PCR buffer (Perkin-Elmer 
buffer containing MgCl2), 200 micromolar concentration of each of the four nucleotides dA, 
dC, dG and dT, 2 ng of single-stranded cDNA, 2.5 units of Taq polymerase (Perkin-Elmer) 
and 2.5 units of Pfu polymerase (Stratagene, Inc). The PCR reaction conditions were 96°C 
for 3 minutes, 35 cycles of (95°C for 1 minute; 63°C for 30 seconds; 72°C for 1 minute), 
followed by 10 minutes at 72°C. The thermocycler employed was the Amplitron II Thermal 
Cycler (Thermolyne). The approximately 600 bp PCR product was digested with SalI and 
BamHI, gel purified and cloned into similarly digested plasmid pUC19 (commercially 
available from New England BioLabs, Beverly, MA). The ligation mixture was transformed 
into E. coli strain DH5alpha and transformants selected on LB plates containing ampicillin. 
Several colonies were grown overnight in LB media and plasmid DNA isolated using 
miniplasmid DNA isolation kits purchased from Qiagen, Inc (Valencia, CA). Clone LB6 was 
determined to have the correct DNA sequence. 
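
The cloning sites designed into BB1 and BB2 can be confirmed by a plain string search, and the primer concentration follows from simple arithmetic. The sketch below is illustrative only. The recognition sequences for SalI, NdeI and BamHI are the standard ones, the primer sequences are copied from the text, and the helper names are hypothetical.

```python
# Standard recognition sequences of the three enzymes named in the text.
SITES = {"SalI": "GTCGAC", "NdeI": "CATATG", "BamHI": "GGATCC"}

BB1 = "GGGGGTCGACCATATGTTCCCAACCATTCCCTTATCCAG"   # SEQ ID NO: 24
BB2 = "GGGGGATCCTCACTAGAAGCCACAGCTGCCCTC"         # SEQ ID NO: 25

def find_sites(oligo):
    """Map enzyme name -> 0-based index of the first occurrence of its site."""
    return {name: oligo.find(site) for name, site in SITES.items() if site in oligo}

def primer_micromolar(pmol, reaction_ul):
    """pmol per microliter is numerically equal to micromolar."""
    return pmol / reaction_ul

bb1_sites = find_sites(BB1)
bb2_sites = find_sites(BB2)
primer_conc = primer_micromolar(20, 100)  # 20 pmoles in a 100 microliter reaction
```

Under these conditions each primer is present at 0.2 micromolar.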

For expression in E. coli, clone LB6 was digested with Nde I and EcoRI, the 
approximately 600 bp fragment gel-purified, and cloned into plasmid pCYB1 (commercially 
available from New England BioLabs, Beverly, MA) that had been digested with the same 
enzymes and phosphatased. The ligation mixture was transformed into E. coli DH5alpha and 
transformants selected on LB ampicillin plates. Plasmid DNA was isolated from several 
transformants and screened by digestion with Nde I and EcoRI. A correct clone was identified 
and named pCYB1: wtGH (pBBT120). This plasmid was transformed into E. coli strains 
JM109 or W3110 (available from New England BioLabs and the American Type Culture 
Collection). 

2. Construction of STII-GH 

Wild type GH clone LB6 (pUC19: wild type GH) was used as the template to 
construct a GH clone containing the E. coli STII signal sequence (Picken et al. 1983). 
Because of its length, the STII sequence was added in two sequential PCR reactions. The 
first reaction used forward primer BB12 and reverse primer BB10. BB10 has the sequence: 

5'-CGCGGATCCGATTAGAATCCACAGCTCCCCTC-3' (SEQ ID NO: 28). 

BB12 has the sequence: 

5'-ATCTATGTTCGTTTTCTCTATCGCTACCAACGCTTACGCATTCCCAACCATTCCCTTATCCAG-3' 
(SEQ ID NO: 30). 

The PCR reactions were as described for amplifying wild type GH except that 
approximately 4 ng of plasmid LB6 was used as the template rather than single-stranded 
cDNA and the PCR conditions were 96°C for 3 minutes, 30 cycles of (95°C for 1 minute; 
63°C for 30 seconds; 72°C for 1 minute) followed by 72°C for 10 minutes. The approximately 
630 bp PCR product was gel-purified using the Qiaex II Gel Extraction Kit (Qiagen, Inc), 
diluted 50-fold in water and 2 microliters used as template for the second PCR reaction. The 
second PCR reaction used reverse primer BB10 and forward primer BB11. BB11 has the 
sequence: 

5'-CCCCCTCTAGACATATGAAGAAGAACATCGCATTCCTGCTGGCATCTATGTTCGTTTTCTCTATCG-3' 
(SEQ ID NO: 29). 

Primer BB11 contains XbaI and Nde I sites for cloning purposes. PCR conditions 
were as described for the first reaction. The approximately 660 bp PCR product was digested 
with XbaI and BamHI, gel-purified and cloned into similarly cut plasmid pCDNA3.1(+) 
(Invitrogen, Inc. Carlsbad, CA). Clone pCDNA3.1(+)::stII-GH(5C) or "5C" was determined 
to have the correct DNA sequence. 
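
The two-step extension works because the 3' end of BB11 repeats the 5' end of BB12. The sketch below joins the two primers on that shared segment and translates the result from the first ATG with the standard genetic code. It is an illustration rather than part of the procedure: the helper names are hypothetical, and the primer sequences are copied from the text with the spacing removed.

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built in TCAG order.
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

def translate(dna):
    """Translate complete codons from the start; a trailing partial codon is ignored."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))

def merge_by_overlap(upstream, downstream, min_overlap=10):
    """Join two sequences on their longest suffix/prefix match."""
    for n in range(min(len(upstream), len(downstream)), min_overlap - 1, -1):
        if upstream.endswith(downstream[:n]):
            return upstream + downstream[n:], n
    raise ValueError("no overlap found")

BB11 = "CCCCCTCTAGACATATGAAGAAGAACATCGCATTCCTGCTGGCATCTATGTTCGTTTTCTCTATCG"
BB12 = "ATCTATGTTCGTTTTCTCTATCGCTACCAACGCTTACGCATTCCCAACCATTCCCTTATCCAG"

extension, overlap = merge_by_overlap(BB11, BB12)
peptide = translate(extension[extension.find("ATG"):])
```

The translated peptide is the signal sequence followed by the first residues of mature GH, beginning with phenylalanine.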

Clone "5C" was cleaved with Nde I and BamHI and cloned into similarly cut 
pBBT108 (a derivative of pUC19 which lacks a Pst I site; this plasmid is described below). A 
clone with the correct insert was identified following digestion with these enzymes. This 
clone, designated pBBT111, was digested with Nde I and SalI, and the 660 bp fragment 
containing the stII-GH fusion gene was gel-purified and cloned into the plasmid expression 
vector pCYB1 (New England BioLabs) that had been digested with the same enzymes and 
phosphatased. A recombinant plasmid containing the stII-GH insertion was identified by 
restriction endonuclease digestions. One such isolate was chosen for further studies and was 
designated pBBT114. This plasmid was transformed into E. coli strains JM109 or W3110 
(available from New England BioLabs and the American Type Culture Collection). 

3. Construction of ompA-GH 

Wild type GH clone LB6 (pUC19: wild type GH) was used as the template to 
construct a GH clone containing the E. coli ompA signal sequence (Movva et al 1980). 
Because of its length, the ompA sequence was added in two sequential PCR reactions. The 
first reaction used forward primer BB7: 

5'-GCAGTGGCACTGGCTGGTTTCGCTACCGTAGCGCAGGCCTTCCCAACCATTCCCTTATCCAG-3' 
(SEQ ID NO: 31), 

and reverse primer BB10: 

5'-CGCGGATCCGATTAGAATCCACAGCTCCCCTC-3' (SEQ ID NO: 28). 

The PCR reactions were as described for amplifying wild type GH except that 
approximately 4 ng of plasmid LB6 was used as the template rather than single-stranded 
cDNA and the PCR conditions were 96°C for 3 minutes, 30 cycles of (95°C for 1 minute; 
63°C for 30 seconds; 72°C for 1 minute) followed by 72°C for 10 minutes. The approximately 
630 bp PCR product was gel-purified using the Qiaex II Gel Extraction Kit (Qiagen, Inc), 
diluted 50-fold in water and 2 microliters used as template for the second PCR reaction. The 
second PCR reaction used reverse primer BB10 and forward primer BB6: 

5'-CCCCGTCGACACATATGAAGAAGACAGCTATCGCGATTGCAGTGGCACTGGCTGGTTTC-3' 
(SEQ ID NO: 32). 

PCR conditions were as described for the first reaction. The approximately 660 bp PCR 
product was gel-purified, digested with Sal I and Bam HI and cloned into 
pUC19 (New England BioLabs) which was cut with Sal I and Bam HI or pCDNA3.1(+) 
(Invitrogen) which had been cut by Xho I and Bam HI (Sal I and Xho I produce compatible 
single-stranded overhangs). When several clones were sequenced, it was discovered that all 
pUC19 clones (8/8) contained errors in the region of the ompA sequence. Only one 
pCDNA3.1(+) clone was sequenced and it contained a sequence ambiguity in the ompA 
region. In order to generate a correct ompA-GH fusion gene, segments of two sequenced 
clones which contained different errors separated by a convenient restriction site were 
recombined and cloned into the pUC19-derivative that lacks the Pst I site (see pBBT108 
described below). The resulting plasmid, designated pBBT112, carries the ompA-GH fusion 
gene cloned as an Nde I - Bam HI fragment into these same sites in pBBT108. This plasmid 
is used in PCR-based, site-specific mutagenesis of GH as described below. 

4. Construction of Pst- pUC19 

To facilitate mutagenesis of the cloned GH gene for construction of selected cysteine 
substitution and insertion mutations, a derivative of the plasmid pUC19 (New England 
BioLabs) lacking a Pst I site was constructed as follows. pUC19 plasmid DNA was digested 
with Pst I and subsequently treated at 75 deg. C with PFU DNA Polymerase (Stratagene) 
using the vendor-supplied reaction buffer supplemented with 200 uM dNTPs. Under these 
conditions the polymerase will digest the 3' single-stranded overhang created by Pst I 
digestion but will not digest into the double-stranded region. The net result will be the 
deletion of the 4 single-stranded bases which comprise the middle four bases of the Pst I 
recognition site. The resulting molecule has double-stranded, i.e., "blunt", ends. Following 
these enzymatic reactions, the linear monomer was gel-purified using the Qiaex II Gel 
Extraction Kit (Qiagen, Inc). This purified DNA was treated with T4 DNA Ligase (New 
England BioLabs) according to the vendor protocols, digested with Pst I, and used to 
transform E. coli DH5alpha. Transformants were picked and analyzed by restriction digestion 
with Pst I and Bam HI. One of the transformants which was not cleaved by Pst I but was 
cleaved at the nearby Bam HI site was picked and designated pBBT108. 
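
The effect of this treatment on the sequence can be sketched on a single strand. Pst I recognizes CTGCAG and cuts after the A, so removing the two 3' overhangs and rejoining the blunt ends drops the middle four bases (TGCA) and leaves CG. The function name and the short test fragment below are hypothetical and serve only to illustrate the deletion.

```python
PST_SITE = "CTGCAG"  # Pst I cuts CTGCA^G, leaving a 4-base 3' overhang (TGCA)

def delete_pst_site(top_strand):
    """Return the top strand after overhang removal and blunt-end religation."""
    i = top_strand.find(PST_SITE)
    if i < 0:
        raise ValueError("no Pst I site present")
    left = top_strand[:i + 1]    # duplex region ends with the first C of the site
    right = top_strand[i + 5:]   # duplex region resumes at the final G of the site
    return left + right

toy_fragment = "GCATGCCTGCAGGTCGAC"  # illustrative fragment containing one Pst I site
repaired = delete_pst_site(toy_fragment)
```

The religated molecule is four base pairs shorter and no longer contains the Pst I site, which is why the second Pst I digestion before transformation selects against unmodified plasmid.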

5. Construction of GH muteins 

GH muteins were generally constructed using site-directed PCR-based mutagenesis as 
described in PCR Protocols: Current Methods and Applications, edited by B. A. White, 1993, 
Humana Press, Inc., Totowa, NJ, and PCR Protocols: A Guide to Methods and Applications, 
edited by Innis, M. A. et al, 1990, Academic Press Inc, San Diego, CA. Typically PCR primer 
oligonucleotides are designed to incorporate nucleotide changes to the coding sequence of 
GH that result in substitution of a cysteine residue for an amino acid at a specific position 
within the protein. Such mutagenic oligonucleotide primers can also be designed to 
incorporate an additional cysteine residue at the carboxy terminus or amino terminus of the 
coding sequence of GH. In this latter case one or more additional amino acid residues could 
also be incorporated at the amino terminal and/or carboxy terminal to the added cysteine 
residue if that were desirable. Moreover, oligonucleotides can be designed to incorporate 
cysteine residues as insertion mutations at specific positions within the GH coding sequence 
if that were desirable. Again, one or more additional amino acids could be inserted along 
with the cysteine residue and these amino acids could be positioned at the amino terminal 
and/or carboxy terminal to the cysteine residue. 

The cysteine substitution mutation T135C was constructed as follows. The mutagenic 
reverse oligonucleotide BB28: 

5'-CTGCTTGAAGATCTGCCCACACCGGGGGCTGCCATC-3' (SEQ ID NO: 33) 

was designed to change the codon ACT for threonine at amino acid residue 135 to a TGT 
codon encoding cysteine and to span the nearby Bgl II site. This oligonucleotide was used in 
PCR along with the forward oligonucleotide BB34 

5'-GTAGCGCAGGCCTTCCCAACCATT-3' (SEQ ID NO: 34), which anneals to the junction 
region of the ompA-GH fusion gene and is not mutagenic. The PCR was performed in a 50 ul 
reaction in 1X PCR buffer (Perkin-Elmer buffer containing 1.5 mM MgCl2), 200 micromolar 
concentration of each of the four nucleotides dA, dC, dG and dT, with each oligonucleotide 
primer present at 0.5 µM, 5 pg of pBBT112 (described above) as template and 1.25 units of 
Amplitac DNA Polymerase (Perkin-Elmer) and 0.125 units of PFU DNA Polymerase 
(Stratagene). Reactions were performed in a Robocycler Gradient 96 thermal cycler 
(Stratagene). The program used entailed: 95 deg C for 3 minutes followed by 25 cycles of 95 
deg C for 60 seconds, 45 deg C or 50 deg C or 55 deg C for 75 seconds, 72 deg C for 60 
seconds followed by a hold at 6 deg C. The PCR reactions were analyzed by agarose gel 
electrophoresis to identify annealing temperatures that gave significant product of the 
expected size, ~430 bp. The 45 deg C reaction was "cleaned up" using the QIAquick PCR 
Purification Kit (Qiagen) and digested with Bgl II and Pst I. The resulting 278 bp Bgl II - Pst I 
fragment, which includes the putative T135C mutation, was gel-purified and ligated into 
pBBT111, the pUC19 derivative carrying the stII-GH fusion gene (described above), which 
had been digested with Bgl II and Pst I and gel-purified. Transformants from this ligation 
were initially screened by digestion with Bgl II and Pst I and subsequently one clone was 
sequenced to confirm the presence of the T135C mutation and the absence of any additional 
mutations that could potentially be introduced by the PCR reaction or by the synthetic 
oligonucleotides. The sequenced clone was found to have the correct sequence. 
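
Because BB28 is a reverse primer, the TGT codon appears in its reverse complement rather than in the oligonucleotide as written. The sketch below reverse-complements BB28 and translates it with the standard genetic code. The helper names are hypothetical, and the reading frame and the assignment of the first codon to D130 are assumptions inferred from the residue numbering used in this example.

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built in TCAG order.
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna):
    """Return the complementary strand read 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]

def translate(dna):
    """Translate complete codons from the start; a trailing partial codon is ignored."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))

BB28 = "CTGCTTGAAGATCTGCCCACACCGGGGGCTGCCATC"  # SEQ ID NO: 33
coding = reverse_complement(BB28)
window = translate(coding)          # assumed to start at residue D130
has_bgl2_site = "AGATCT" in BB28    # Bgl II recognition sequence
```

If the first codon is D130, the sixth residue of `window` is position 135, and it reads as cysteine.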

The substitution mutation S132C was constructed using the protocol described above 
for T135C with the following differences: mutagenic reverse oligonucleotide BB29 
5'-CTGCTTGAAGATCTGCCCAGTCCGGGGGCAGCCATCTTC-3' (SEQ ID NO: 35) was 
used instead of BB28 and the PCR reaction with annealing temperature of 50 deg C was used 
for cloning. One of two clones sequenced was found to have the correct sequence. 

The substitution mutation T148C was constructed using an analogous protocol but 
employing a different cloning strategy. The mutagenic forward oligonucleotide BB30 
5'-GGGCAGATCTTCAAGCAGACCTACAGCAAGTTCGACTGCAACTCACACAAC-3' 
(SEQ ID NO: 36) was used in PCR with the non-mutagenic reverse primer BB33 
5'-CGCGGTACCCGGGATCCGATTAGAATCCACAGCT-3' (SEQ ID NO: 37), which 
anneals to the most 3' end of the GH coding sequence and spans the Bam HI site 
immediately downstream. PCR was performed as described above with the exception that the 
annealing temperatures used were 46, 51 and 56 deg C. Following PCR and gel analysis as 
described above the 46 and 51 deg C reactions were pooled for cloning. These were digested 
with Bam HI and Bgl II, gel-purified and cloned into pBBT111 which had been digested with 
Bam HI and Bgl II, treated with Calf Intestinal Alkaline Phosphatase (Promega) according to 
the vendor protocols, and gel-purified. Transformants from this ligation were analyzed by 
digestion with Bam HI and Bgl II to identify clones in which the 188 bp Bam HI - Bgl II 
mutagenic PCR fragment was cloned in the proper orientation. Because Bam HI and Bgl II 
generate compatible ends, this cloning step is not orientation specific. Five of six clones 
tested were shown to be correctly oriented. One of these was sequenced and was shown to 
contain the desired T148C mutation. The sequence of the remainder of the 188 bp Bam HI - 
Bgl II mutagenic PCR fragment in this clone was confirmed as correct. 
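
The reason the fragment can insert in either orientation is that both enzymes leave the same four-base overhang. The sketch below derives the overhangs from the standard recognition sequences and cut positions of Bam HI and Bgl II, and of Sal I and Xho I, the other compatible pair mentioned earlier in this example. The dictionary layout and function names are hypothetical.

```python
# (recognition sequence, number of bases before the top-strand cut)
ENZYMES = {
    "BamHI": ("GGATCC", 1),   # G^GATCC
    "BglII": ("AGATCT", 1),   # A^GATCT
    "SalI":  ("GTCGAC", 1),   # G^TCGAC
    "XhoI":  ("CTCGAG", 1),   # C^TCGAG
}

def overhang(name):
    """Single-stranded 5' overhang left by a palindromic cutter."""
    site, cut = ENZYMES[name]
    return site[cut:len(site) - cut]

def compatible(a, b):
    """True when the two enzymes leave identical overhangs."""
    return overhang(a) == overhang(b)

def hybrid_junction(left_enzyme, right_enzyme):
    """Top-strand sequence formed when an end cut by one enzyme joins an end cut by the other."""
    left_site, left_cut = ENZYMES[left_enzyme]
    right_site, right_cut = ENZYMES[right_enzyme]
    return left_site[:left_cut] + right_site[right_cut:]

junction = hybrid_junction("BamHI", "BglII")
```

A Bam HI end joined to a Bgl II end forms a hybrid junction that neither enzyme recognizes, so a diagnostic digest with both enzymes can distinguish the two orientations.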

The construction of the substitution mutation S144C was identical to the construction 
of T148C with the following exceptions. Mutagenic forward oligonucleotide BB31 
5'-GGGCAGATCTTCAAGCAGACCTACTGCAAGTTCGAC-3' (SEQ ID NO: 38) was used 
instead of BB30. Two of six clones tested were shown to be correctly oriented. One of these 
was sequenced and was shown to contain the desired S144C mutation. The sequence of the 
remainder of the 188 bp Bam HI - Bgl II mutagenic PCR fragment in this clone was 
confirmed as correct. 

A mutation was also constructed that added a cysteine residue to the natural carboxy 
terminus of GH. The construction of this mutation, termed stp192C, was similar to that of 
T148C, but employed different oligonucleotide primers. The reverse mutagenic 
oligonucleotide BB32 

5'-CGCGGTACCGGATCCTTAGCAGAAGCCACAGCTGCCCTCCAC-3' (SEQ ID NO: 
39), which inserts a TGC codon for cysteine between the codon for the carboxy terminal phe 
residue of GH and the TAA translational stop codon and spans the nearby Bam HI site, was 
used along with BB34 5'-GTAGCGCAGGCCTTCCCAACCATT-3' (SEQ ID NO: 40), which 
is described above. Following PCR and gel analysis as described above, the 46 deg C reaction 
was used for cloning. Three of six clones tested were shown to be correctly oriented. One of 
these was sequenced and was shown to contain the desired stp192C mutation. The sequence 
of the remainder of the 188 bp Bam HI - Bgl II mutagenic PCR fragment in this clone was 
confirmed as correct. 

Analogous PCR mutagenesis procedures can be used to generate other cysteine 
mutations. The choice of sequences for mutagenic oligonucleotides will be dictated by the 
position where the desired cysteine residue is to be placed and the propinquity of useful 
restriction endonuclease sites. Generally, it is desirable to place the mutation, i.e., the 
mismatched segment, near the middle of the oligonucleotide to enhance the annealing of the 
oligonucleotide to the template. Appropriate annealing temperatures for any oligonucleotide 
can be determined empirically. It is also desirable for the mutagenic oligonucleotide to span a 
unique restriction site so that the PCR product can be cleaved to generate a fragment that can 
be readily cloned into a suitable vector, e.g., one that can be used to express the mutein or 
provides convenient restriction sites for excising the mutated gene and readily cloning it into 
such an expression vector. Sometimes mutation sites and restriction sites are separated by 
distances that are greater than that which is desirable for synthesis of synthetic 
oligonucleotides: it is generally desirable to keep such oligonucleotides under 80 bases in 
length and lengths of 30-40 bases are preferable. 
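
Two of these guidelines, centering the mismatch and keeping the oligonucleotide under 80 bases, can be expressed as a short helper. The sketch below is a simplification under stated assumptions: it builds only a forward-strand oligonucleotide, always uses TGT as the cysteine codon, and does not check whether the result spans a restriction site. The function name and the toy coding sequence are hypothetical.

```python
def design_cys_primer(coding_seq, residue, flank=15, codon="TGT"):
    """Forward mutagenic oligo with up to `flank` template bases on each side of the codon.

    `residue` is the 1-based codon number within `coding_seq`.
    """
    start = (residue - 1) * 3
    if residue < 1 or start + 3 > len(coding_seq):
        raise ValueError("residue lies outside the coding sequence")
    left = coding_seq[max(0, start - flank):start]
    right = coding_seq[start + 3:start + 3 + flank]
    oligo = left + codon + right
    if len(oligo) >= 80:
        raise ValueError("keep mutagenic oligonucleotides under 80 bases")
    return oligo

toy_cds = "ATGGCTAAAGGTACCGAACTGCATTTTCCG"  # hypothetical 10-codon sequence
primer = design_cys_primer(toy_cds, residue=5, flank=6)
```

With the default `flank=15` the oligonucleotide is 33 bases long when the target codon is far enough from both ends, which falls within the 30-40 base range described above. A reverse primer would be the reverse complement of this sequence.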

In instances where this is not possible, genes targeted for mutagenesis could be 
re-engineered or re-synthesized to incorporate restriction sites at appropriate positions. 
Alternatively, variations of PCR mutagenesis protocols employed above, such as the so-called 
"Megaprimer Method" (Barik, S., pp. 277-286 in Methods in Molecular Biology, Vol. 15: 
PCR Protocols: Current Methods and Applications, edited by B. A. White, 1993, Humana 
Press, Inc., Totowa, NJ) or "Gene Splicing by Overlap Extension" (Horton, R. M., pp. 251- 
261, in Methods in Molecular Biology, Vol. 15: PCR Protocols: Current Methods and 
Applications, edited by B. A. White, 1993, Humana Press, Inc., Totowa, NJ) can also be 
employed to construct such mutations. 
6. Expression of GH in pCYB1 

To express GH in E. coli, pBBT120 (GH gene with no leader sequence cloned into the 
tac expression vector pCYB1) and pBBT114 (GH gene with stII leader sequence cloned into 
the tac expression vector pCYB1) were transformed into E. coli strains JM109 and W3110. 
The parental vector pCYB1 was also transformed into JM109 and W3110. 
These strains were given the following designations: 

BOB119    JM109(pCYB1) 
BOB130    W3110(pCYB1) 
BOB129    JM109(pBBT120) 
BOB133    W3110(pBBT120) 
BOB121    JM109(pBBT114) 
BOB132    W3110(pBBT114) 

For expression, strains were grown overnight at 37°C in Luria Broth (LB) 
(Sambrook, et al 1989) containing 100 µg/ml ampicillin. These saturated overnight cultures 
were diluted to ~0.03 OD at A600 in LB containing 100 µg/ml ampicillin and incubated at 
37°C in shake flasks in a rotary shaker, typically at 250-300 rpm. ODs were monitored and IPTG 
was added to a final concentration of 0.5 mM when culture ODs reached ~0.25-0.5, typically 
between 0.3 and 0.4. Cultures were sampled typically at 1, 3, 5 and ~16 h post-induction. The 
"~16 h" time points represented overnight incubation of the cultures and exact times varied 
from ~15-20 h. Samples of induced and uninduced cultures were pelleted by centrifugation, 
resuspended in 1X sample buffer (50 mM Tris-HCl (pH 6.8), 2% sodium lauryl sulfate, 10% 
glycerol, 0.1% bromphenol blue) with the addition of 1% β-mercaptoethanol when desirable. 
Samples were boiled for ~10 minutes or heated to 95°C for ~10 minutes. Samples were 
cooled to room temperature before being loaded onto SDS polyacrylamide gels or were stored 
at -20°C if not run immediately. Samples were run on precast 15% polyacrylamide "Ready 
Gels" (Bio-Rad, Hercules CA) using a Ready Gel Cell electrophoresis apparatus (Bio-Rad) 
according to the vendor protocols. Typically gels were run at 200 volts for ~35-45 minutes. 
Gels were stained with Coomassie Blue or were analyzed by Western Blot following 
electro-blotting. Coomassie staining of whole cell lysates from strains BOB129, BOB133, 
BOB121 and BOB132 showed a band of ~22 kD that co-migrated with purified recombinant 
human GH standard purchased from Research Diagnostics Inc (Flanders, NJ). That band was 
most prominent in induced cultures following overnight induction. However, a band was also 
observed at that molecular weight in uninduced cultures of these same strains and could also 
be observed with and without induction in the BOB119 and BOB130 control strains that 
carried the expression vector pCYB1 lacking the GH gene. To clarify this observation, 
Western Blot analyses were performed on whole cell lysates of induced cultures of strains 
BOB119, BOB130, BOB129, BOB133, BOB121, and BOB132. Western blots were 
performed with polyclonal rabbit anti-human GH antiserum purchased from United States 
Biological; catalogue # G 9000-11 (Swampscott, MA). This primary antibody was used at a 
1:5000 dilution and its binding was detected with goat anti-rabbit IgG Fc conjugated to 
alkaline phosphatase (product #31341) purchased from Pierce (Rockford, IL). This 
secondary antibody was used at a 1:10,0000 dilution. Alkaline phosphatase activity was 
detected using the ImmunoPure® Fast Red TR/AS-MX Substrate Kit (Pierce, Rockford IL) 
according to the vendor protocols. The Western Blots clearly demonstrated presence of GH 
in lysates of induced cultures of BOB129, BOB133, BOB121, and BOB132 at both 3 and 16 
h post-induction. In the induced cultures of control strains BOB119 and BOB130, no GH was 
detected by Western blot at 3 or 16 h post-induction time points. 

In these preliminary experiments, the highest yields of GH were obtained from
BOB132, W3110(pBBT114), in which the GH gene is fused downstream of the stII secretion
signal sequence. This strain was tested further to determine if the GH protein was secreted to
the periplasm as would be expected. An induced culture of BOB132 was prepared as
described above and subjected to osmotic shock according to the procedure of Koshland and
Botstein (Cell 20 (1980) pp. 749-760). This procedure ruptures the outer membrane and
releases the contents of the periplasm into the surrounding medium. Subsequent
centrifugation separates the periplasmic contents, present in the supernatant, from the
remainder of the cell-associated components. In this experiment, the bulk of the GH
synthesized by BOB132 was found to be localized to the periplasm. This result is consistent
with the finding that the bulk of the total GH is also indistinguishable in size from the
purified GH standard, which indicated that the stII signal sequence had been removed. This
is indicative of secretion. A larger scale (500 ml) culture of BOB132 was also induced,
cultured overnight and subjected to osmotic shock according to the procedure described by
Hsiung et al., 1986 (Bio/Technology 4, pp. 991-995). Gel analysis again demonstrated that
the bulk of the GH produced was soluble, periplasmic, and indistinguishable in size from the
GH standard. This material could also be quantitatively bound to, and eluted from, a
Q-Sepharose column using conditions very similar to those described for recombinant human
GH by Becker and Hsiung, 1986 (FEBS Lett 204 pp. 145-150).
7. Cloning the human GH receptor

The human GH receptor was cloned by PCR using forward primer BB3 and reverse
primer BB4. BB3 has the sequence:

5'-CCCCGGATCCGCCACCATGGATCTCTGGCAGCTGCTGTT-3' (SEQ ID NO: 26).

BB4 has the sequence:

5'-CCCCGTCGACTCTAGAGCTATTAAATACGTAGCTCTTGGG-3' (SEQ ID NO: 27).

The template was single-stranded cDNA prepared from human liver (commercially available
from CLONTECH Laboratories). Primers BB3 and BB4 contain BamHI and SalI restriction
sites, respectively, for cloning purposes. The 100 µl PCR reactions contained 2.5 ng of the
single-stranded cDNA and 20 picomoles of each primer in 1X PCR buffer (Perkin-Elmer
buffer containing MgCl2), 200 micromolar concentration of each of the four nucleotides dA,
dC, dG and dT, 2.5 units of Taq polymerase (Perkin-Elmer) and 2.5 units of Pfu polymerase
(Stratagene, Inc). The PCR reaction conditions were: 96°C for 3 minutes, 35 cycles of (95°C
for 1 minute; 58°C for 30 seconds; 72°C for 2 minutes), followed by 10 minutes at 72°C. The
thermocycler employed was the Amplitron II Thermal Cycler (Thermolyne). The
approximately 1.9 kb PCR product was digested with BamHI and SalI and ligated with
similarly cut plasmid pUC19 (New England BioLabs). However, none of the transformants
obtained from this ligation reaction contained the 1.9 kb PCR fragment. Leung et al. (Nature
1987 330 pp. 537-543) also failed to obtain full-length cDNA clones of the human GH
receptor in pUC19. Subsequently the PCR fragment was cloned into a low copy number
vector, pACYC184 (New England BioLabs), at the BamHI and SalI sites in this vector. Such
clones were obtained at reasonable frequencies but E. coli strains carrying the cloned PCR
fragment grew poorly, forming tiny and heterogeneous looking colonies, in the presence of
chloramphenicol, which is used to select for maintenance of pACYC184.
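
The cloning sites engineered into the primers can be checked directly from the primer sequences. The following is a minimal sketch, not part of the original procedure: the primer sequences are those given above, the recognition sequences are the standard published ones for each enzyme, and the function name find_sites is illustrative only.

```python
# Standard recognition sequences for the enzymes named in the text.
SITES = {
    "BamHI": "GGATCC",
    "SalI": "GTCGAC",
    "XbaI": "TCTAGA",
    "XhoI": "CTCGAG",
}

# Primer sequences as given in the text (SEQ ID NO: 26 and 27).
BB3 = "CCCCGGATCCGCCACCATGGATCTCTGGCAGCTGCTGTT"
BB4 = "CCCCGTCGACTCTAGAGCTATTAAATACGTAGCTCTTGGG"


def find_sites(seq):
    """Return {enzyme: [0-based start positions]} for every site found in seq."""
    seq = seq.upper()
    hits = {}
    for enzyme, site in SITES.items():
        positions = []
        start = seq.find(site)
        while start != -1:
            positions.append(start)
            start = seq.find(site, start + 1)
        if positions:
            hits[enzyme] = positions
    return hits


for name, primer in (("BB3", BB3), ("BB4", BB4)):
    print(name, find_sites(primer))
```

Run on these two primers, the scan reports a BamHI site in BB3 and a SalI site in BB4, as stated in the text. It also reports an XbaI site in BB4 immediately after the SalI site.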



The PCR fragment was simultaneously cloned into pCDNA3.1(+) (Invitrogen). The
approximately 1.9 kb PCR product was digested with BamHI and SalI and ligated into the
BamHI and XhoI cloning sites of pCDNA3.1(+). Only infrequent transformants from this
ligation contained the cloned GH receptor cDNA and all of those were found to contain
deletions of segments of the receptor coding sequence. One of these clones was sequenced
and found to contain a deletion of 135 bp within the GH receptor coding sequence; the
sequence of the rest of the gene was in agreement with that reported by Leung et al. (1987).
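
Ligating a SalI-cut fragment into an XhoI-cut vector works because the two enzymes leave identical cohesive ends. The sketch below illustrates this under stated assumptions: the cut positions are the standard published ones (G^GATCC, G^TCGAC, C^TCGAG, T^CTAGA), and the helper names are hypothetical.

```python
# (recognition sequence, cut offset on the top strand) for enzymes named in the text.
# Offsets follow the standard published cut positions: G^GATCC, G^TCGAC, C^TCGAG, T^CTAGA.
ENZYMES = {
    "BamHI": ("GGATCC", 1),
    "SalI": ("GTCGAC", 1),
    "XhoI": ("CTCGAG", 1),
    "XbaI": ("TCTAGA", 1),
}


def overhang(enzyme):
    """Return the 5' single-stranded overhang left by a palindromic cutter."""
    site, cut = ENZYMES[enzyme]
    return site[cut:len(site) - cut]


def compatible(a, b):
    """Ends from palindromic sites can anneal if their 5' overhangs are identical."""
    return overhang(a) == overhang(b)


def ligated_junction(a, b):
    """Sequence formed when the left half of site a is joined to the right half of site b."""
    site_a, cut_a = ENZYMES[a]
    site_b, cut_b = ENZYMES[b]
    return site_a[:cut_a] + site_b[cut_b:]


print(overhang("SalI"), overhang("XhoI"), compatible("SalI", "XhoI"))
print(ligated_junction("SalI", "XhoI"))
```

The hybrid junction produced by a SalI/XhoI ligation matches neither recognition sequence, so it cannot be recut with either enzyme. This is worth keeping in mind when planning later subcloning steps.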
8. Cloning the rabbit GH receptor

The rabbit GH receptor was cloned by PCR using forward primer BB3 (described
above) and reverse primer BB36. BB36 has the sequence

5'-CCCCGTCGACTCTAGAGCCATTAGATACAAAGCTCTTGGG-3' (SEQ ID NO: 41)

and contains XbaI and SalI restriction sites for cloning purposes. Rabbit liver poly(A)+
mRNA was purchased from CLONTECH, Inc. and used as the substrate in first strand
synthesis of single-stranded cDNA to produce template for PCR amplification. First strand
synthesis of single-stranded cDNA was accomplished using a 1st Strand cDNA Synthesis Kit
for RT-PCR (AMV) from Boehringer Mannheim Corp (Indianapolis, IN) according to the
vendor protocols. Parallel first strand cDNA syntheses were performed using random
hexamers or BB36 as the primer. Subsequent PCR reactions, with the products of the first
strand syntheses as templates, were carried out with primers BB3 and BB36 according to the
1st Strand cDNA Synthesis Kit for RT-PCR (AMV) protocol and using 2.5 units of
Amplitaq DNA Polymerase (Perkin-Elmer) and 0.625 units of Pfu DNA Polymerase
(Stratagene). The PCR reaction conditions were 96°C for 3 minutes, 35 cycles of (95°C for 1
minute; 58°C for 30 seconds; 72°C for 2 minutes), followed by 10 minutes at 72°C. The
thermocycler employed was the Amplitron II Thermal Cycler (Thermolyne). The expected
~1.9 kb PCR product was observed in PCR reactions using random hexamer-primed or
BB36-primed cDNA as template. The PCR product obtained from the random hexamer-primed
cDNA was used in subsequent cloning experiments. It was digested with BamHI and XbaI and
run out over a 1.2% agarose gel. This digest generates two fragments (~365 bp and ~1600 bp)
because the rabbit GH receptor gene contains an internal BamHI site. Both fragments were
gel-purified. Initially the ~1600 bp BamHI-XbaI fragment was cloned into pCDNA3.1(+) which
had been digested with these same two enzymes. These clones were readily obtained at
reasonable frequencies and showed no evidence of deletions as determined by restriction
digests and subsequent sequencing. To generate a full-length clone, one of the plasmids
containing the 1600 bp BamHI-XbaI fragment (pCDNA3.1(+)::rab-ghr-2A) was digested with
BamHI, treated with Calf Intestinal Alkaline Phosphatase (Promega) according to the vendor
protocols, gel-purified and ligated with the gel-purified ~365 bp BamHI fragment that
contains the 5' portion of the rabbit GH receptor gene. Transformants from this ligation were
picked and analyzed by restriction digestion and PCR to confirm the presence of the ~365 bp
fragment and to determine its orientation relative to the distal segment of the rabbit GH
receptor gene. Three out of four clones analyzed were found to contain the ~365 bp fragment
cloned in the correct orientation for reconstitution of the rabbit GH receptor gene. The lack
of complications in the cloning in E. coli of the rabbit gene, in contrast to the human gene, is
consistent with the results of Leung et al. (1987), who also readily obtained full-length cDNA
clones for the rabbit GH receptor gene but were unable to clone a full-length cDNA of the
human gene in E. coli. The rabbit GH receptor can be employed in assays with human GH as a
ligand, as it has been shown that human GH binds the rabbit receptor with high affinity
(Leung et al., 1987). Plasmids containing the cloned rabbit GH receptor should be sequenced
to identify a rabbit GH receptor cDNA with the correct sequence before use.
9. Construction of a human/rabbit Chimeric GH Receptor Gene

As an alternative to the rabbit receptor, a chimeric receptor could be constructed
which combines the extracellular domain of the human receptor with the transmembrane and
cytoplasmic domains of the rabbit receptor. Such a chimeric receptor could be constructed by
recombining the human and rabbit genes at the unique NcoI site that is present in each
(Leung et al., 1987). Such a recombinant, containing the human gene segment located 5' to, or
"upstream" of, the NcoI site and the rabbit gene segment 3' to, or "downstream" of, the NcoI
site would encode a chimeric receptor of precisely the desired type, having the extracellular
domain of the human receptor with the transmembrane and cytoplasmic domains of the rabbit
receptor. This would allow analysis of the interaction of GH, GH muteins, and PEGylated
GH muteins with the natural receptor binding site but could avoid the necessity of cloning the
full-length human GH receptor in E. coli.

The GH muteins can be expressed in a variety of expression systems such as bacteria,
yeast or insect cells. Vectors for expressing GH muteins in these systems are available
commercially from a number of suppliers such as Novagen, Inc. (pET15b for expression in E.
coli), New England Biolabs (pCYB1 for expression in E. coli) and Invitrogen (pVL1392,
pVL1393, and pMELBAC for expression in insect cells using Baculovirus vectors, Pichia
vectors for expression in yeast cells, and pCDNA3 for expression in mammalian cells). GH
has been successfully produced in E. coli as a cytoplasmic protein and as a secreted,
periplasmic protein using the E. coli OmpA or STII signal sequences to promote secretion of
the protein into the periplasmic space (Chang et al., 1987; Hsiung et al., 1986). It is
preferable that the GH muteins be expressed as secreted proteins so that they do not contain
an N-terminal methionine residue, which is not present in the natural human protein. For
expression in E. coli, DNA sequences encoding GH or GH muteins can be cloned into E. coli
expression vectors such as pET15b, which uses the strong T7 promoter, or pCYB1, which uses
the TAC promoter. Adding IPTG (isopropylthiogalactopyranoside, available from Sigma
Chemical Company) to the growth media can induce expression of the protein. Recombinant
GH will be secreted into the periplasmic space, from which it can be released by osmotic
shock and subsequently purified (Becker and Hsiung, 1986). The protein can
be purified further using other chromatographic methods such as ion-exchange, hydrophobic
interaction, size-exclusion and reversed phase chromatography, all of which are well known
to those of skill in the art (e.g., see Becker and Hsiung, 1986). Protein concentrations can be
determined using commercially available protein assay kits such as those sold by BioRad
Laboratories (Richmond, CA). If the GH proteins are insoluble when expressed in E. coli
they can be refolded using procedures well known to those skilled in the art (see Cox et al.,
1994 and World patent applications WO 9422466 and WO 9412219).

Alternatively, the proteins can be expressed in insect cells as secreted proteins. The
expression plasmid can be modified to contain the GH signal sequence to promote secretion
of the protein into the medium. The cDNAs can be cloned into commercially available
vectors, e.g., pVL1392 from Invitrogen, Inc., and used to infect insect cells. The GH and GH
muteins can be purified from conditioned media using conventional chromatography
procedures. Antibodies to rhGH can be used in conjunction with Western blots to localize
fractions containing the GH proteins during chromatography. Alternatively, fractions
containing GH can be identified using ELISA assays.

The cysteine-added GH variants also can be expressed as intracellular or secreted
proteins in eukaryotic cells such as yeast, insect cells or mammalian cells. Vectors for
expressing the proteins and methods for performing such experiments are described in
catalogues from various commercial supply companies such as Invitrogen, Inc., Stratagene,
Inc. and ClonTech, Inc. The GH and GH muteins can be purified using conventional
chromatography procedures.

Biological activity of the GH muteins can be measured using a cell line that
proliferates in response to GH. Fuh et al. (1992) created a GH-responsive cell line by stably
transforming a myeloid leukemia cell line, FDC-P1, with a chimeric receptor comprising the
extracellular domain of the rabbit GH receptor fused to the mouse G-CSF receptor. This cell
line proliferates in response to GH with a half maximal effective concentration (EC50) of 20
picomolar. A similar cell line can be constructed using the published sequences of these
receptors and standard molecular biology techniques (Fuh et al., 1992). Alternatively, the
extracellular domain of the human GH receptor can be fused to the mouse G-CSF receptor
using the published sequences of these receptors and standard molecular biology techniques.
Transformed cells expressing the chimeric receptor can be identified by flow cytometry using
labeled GH, by the ability of transformed cells to bind radiolabeled GH, or by the ability of
transformed cells to proliferate in response to added GH. Purified GH and GH muteins can
be tested in cell proliferation assays using cells expressing the chimeric receptor to measure
specific activities of the proteins. Cells can be plated in 96-well dishes with various
concentrations of GH or GH muteins. After 18 h, cells are treated for 4 hours with
3H-thymidine and harvested for determination of incorporated radioactivity. The EC50 can be
determined for each mutein. Assays should be performed at least three times for each mutein
using triplicate wells for each data point. GH muteins displaying similar optimal levels of
stimulation and EC50 values comparable to or greater than wild type GH are preferable.
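
An EC50 can be estimated from the averaged triplicate wells in several ways. The sketch below uses simple log-linear interpolation at the half-maximal response, which is a stand-in for a full sigmoidal curve fit and not a method taken from the text. The well counts are hypothetical, and the function names are illustrative.

```python
import math


def mean(values):
    """Arithmetic mean of replicate wells."""
    return sum(values) / len(values)


def estimate_ec50(concs, responses):
    """Estimate EC50 by log-linear interpolation at the half-maximal response.

    concs: increasing concentrations (any unit, all > 0)
    responses: matching mean responses, assumed to rise with concentration
    """
    bottom, top = min(responses), max(responses)
    half = bottom + (top - bottom) / 2.0
    for i in range(len(concs) - 1):
        r0, r1 = responses[i], responses[i + 1]
        if r0 <= half <= r1 and r1 != r0:
            frac = (half - r0) / (r1 - r0)
            log_lo, log_hi = math.log10(concs[i]), math.log10(concs[i + 1])
            return 10 ** (log_lo + frac * (log_hi - log_lo))
    raise ValueError("half-maximal response is not bracketed by the data")


# Hypothetical triplicate 3H-thymidine counts (cpm) keyed by GH concentration (pM).
wells = {
    1: [110, 90, 100],
    10: [390, 410, 400],
    100: [1000, 990, 1010],
    1000: [1100, 1090, 1110],
}
concs = sorted(wells)
means = [mean(wells[c]) for c in concs]
print("Estimated EC50 (pM):", round(estimate_ec50(concs, means), 1))
```

Interpolation depends on how closely the dilution series brackets the midpoint, so a narrower dilution step around the expected EC50 gives a better estimate. A four-parameter logistic fit would be the more usual choice when enough points are available.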

GH muteins that retain in vitro activity can be PEGylated using a cysteine-reactive 8
kDa PEG-maleimide (or PEG-vinylsulfone) commercially available from Shearwater, Inc.
Generally, methods for PEGylating the proteins with these reagents will be similar to those
described in world patent applications WO 9412219 and WO 9422466 and PCT application
US95/06540, with minor modifications. The recombinant proteins must be partially reduced
with dithiothreitol (DTT) in order to achieve optimal PEGylation of the free cysteine.
Although the free cysteine is not involved in a disulfide bond, it is relatively unreactive to
cysteine-reactive PEGs unless this partial reduction step is performed. The amount of DTT
required to partially reduce each mutein can be determined empirically, using a range of DTT
concentrations. Typically, a 5-10 fold molar excess of DTT for 30 min at room temperature
is sufficient. Partial reduction can be detected by a slight shift in the elution profile of the
protein from a reversed-phase column. Care must be taken not to over-reduce the protein and
expose additional cysteine residues. Over-reduction can be detected by reversed-phase HPLC
(the protein will have a retention time similar to the fully reduced and denatured protein) and
by the appearance of GH molecules containing two PEGs (detectable by a molecular weight
change on SDS-PAGE). Wild type GH can serve as a control since it should not PEGylate
under similar conditions. Excess DTT can be removed by size exclusion chromatography
using spin columns. The partially reduced protein can be reacted with various concentrations
of PEG-maleimide (PEG:protein molar ratios of 1:1, 5:1, 10:1 and 50:1) to determine the
optimum ratio of the two reagents. PEGylation of the protein can be monitored by a
molecular weight shift using sodium dodecyl sulfate polyacrylamide gel electrophoresis
(SDS-PAGE). The lowest amount of PEG that gives significant quantities of mono-PEGylated
product without giving di-PEGylated product will be considered optimum (80% conversion to
mono-PEGylated product is considered good). Generally, mono-PEGylated protein can be
purified from non-PEGylated protein and unreacted PEG by size-exclusion or ion exchange
chromatography. The purified PEGylated protein can be tested in the cell proliferation assay
described above to determine its specific activity.
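
The molar excesses quoted above translate into reagent masses by simple arithmetic. The following sketch works through that conversion under stated assumptions: GH is taken as approximately 22 kDa (from the band size reported earlier), the PEG-maleimide as its nominal 8 kDa, and DTT as 154.25 g/mol. The function and variable names are hypothetical.

```python
GH_MW_DA = 22_000   # assumed approximate mass of GH (~22 kD band reported in the text)
PEG_MW_DA = 8_000   # nominal mass of the 8 kDa PEG-maleimide
DTT_MW_DA = 154.25  # molar mass of dithiothreitol


def nmol_from_ug(mass_ug, mw_da):
    """Convert a mass in micrograms to nanomoles."""
    return mass_ug / mw_da * 1000.0


def ug_from_nmol(nmol, mw_da):
    """Convert nanomoles to a mass in micrograms."""
    return nmol * mw_da / 1000.0


def reagent_plan(protein_ug, peg_ratios=(1, 5, 10, 50), dtt_excess=10):
    """Masses of DTT and PEG needed for the given molar excesses over protein."""
    protein_nmol = nmol_from_ug(protein_ug, GH_MW_DA)
    return {
        "protein_nmol": protein_nmol,
        "dtt_ug": ug_from_nmol(protein_nmol * dtt_excess, DTT_MW_DA),
        "peg_ug": {r: ug_from_nmol(protein_nmol * r, PEG_MW_DA) for r in peg_ratios},
    }


plan = reagent_plan(22.0)  # 22 micrograms of mutein, i.e. about 1 nmol
print(plan)
```

Because the PEG is more than a third of the protein's mass, the higher ratios require far more PEG than protein by weight. This is one practical reason to look for the lowest ratio that still gives good conversion.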

The above experiments will allow identification of amino acids in the B-C loop, C-D
loop or N-terminal end of the A-B loop in GH that can be changed to cysteine residues,
PEGylated and retain in vitro biological activity. These muteins can be tested in animal
disease models well known in the art.

Experiments can be performed to confirm that the PEG molecule is attached to the
protein at the proper site. This can be accomplished by proteolytic digestion of the protein,
purification of the PEG peptide (which will have a large molecular weight) by size exclusion,
ion exchange or reversed phase chromatography, followed by amino acid sequencing or mass
spectroscopy. The PEG-coupled amino acid will appear as a blank in the amino acid
sequencing run.

The pharmacokinetic properties of the PEG-GH proteins can be determined as
follows, or as described in world patent application WO 9422466. Pairs of rats or mice can
receive an intravenous bolus injection of the test proteins. Circulating levels of the proteins
are measured over the course of 24 h by removing a small sample of blood from the animals
at desired time points. Circulating levels of the test proteins can be quantitated using ELISA
assays. Additional experiments can be performed using the subcutaneous route to administer
the proteins. Similar experiments should be performed with the non-PEGylated protein to
serve as a control. These experiments will reveal whether attachment of a PEG reagent to the
protein alters its pharmacokinetic properties. Covalent modification of the protein with PEG
should increase the protein's circulating half-life relative to the unPEGylated protein. Larger
PEG molecules and/or attachment of multiple PEG molecules should lengthen the circulating
half-life more than smaller PEG molecules.
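
A circulating half-life can be estimated from the ELISA time course by fitting a straight line to the logarithm of concentration against time. This is a minimal sketch that assumes single-exponential elimination after the intravenous bolus. The data values are hypothetical, and the function name is illustrative.

```python
import math


def half_life(times_h, concs):
    """Fit ln(concentration) versus time by least squares; return t1/2 = ln(2)/k."""
    ys = [math.log(c) for c in concs]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, ys))
    slope = sxy / sxx
    if slope >= 0:
        raise ValueError("concentration does not decline over time")
    return math.log(2) / -slope


# Hypothetical serum levels (ng/ml) at the sampling times (h) after injection.
times = [0, 2, 4, 6]
levels = [100.0, 50.0, 25.0, 12.5]
print("Estimated half-life (h):", round(half_life(times, levels), 2))
```

Running the same fit on the PEGylated and non-PEGylated proteins gives directly comparable numbers. Real time courses often show a fast distribution phase followed by a slower elimination phase, in which case only the later points should be fitted.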

PEG-GH proteins can be tested in rodent models of growth hormone deficiency (Cox
et al., 1994) and cachexia (Tomas et al., 1992; Read et al., 1992) to determine optimum
dosing schedules and demonstrate efficacy. These studies can explore different size PEG
molecules, e.g., 8 and 20 kDa, and dosing schedules to determine the optimum PEG size and
dosing schedule. It is expected that the larger PEG molecule will increase the circulating
half-life more than the smaller PEG molecule and will require less frequent dosing.
However, large proteins potentially may have reduced volumes of distribution in vivo; thus, it
is possible that a 20 kDa PEG attached to GH will limit bioavailability, reducing its efficacy.
Rodent models will allow determination of whether this is the case. Once the optimum
dosing schedules and PEG sizes are determined, the efficacy of PEG-GH can be
compared to that of GH in the animal models. While all PEG-GH proteins having GH activity
are included in the invention, the preferred PEG-GH proteins are those that enhance growth
to a degree equal or superior to GH, but which can be given less frequently. PEG-GH should
be more efficacious than GH when both are administered using the less frequent dosing
schedules.

One GH deficiency model that can be used is a hypophysectomized rat. GH
stimulates body weight gain and bone and cartilage growth in this model (Cox et al., 1994).
Hypophysectomized rats can be purchased from Charles River. Rats can be injected with
GH, PEG-GH or placebo and weight gain measured daily over a 10-14 day period. At time of
sacrifice, tibial epiphysis width can be determined as a measure of bone growth.
Experimental methods for performing these studies are described in Cox et al. (1994).

The efficacy of PEG-GH in rodent cachexia models can be tested in a similar manner.
Daily administration of dexamethasone, via osmotic pumps or subcutaneous injection, can be
used to induce weight loss (Tomas et al., 1992; Read et al., 1992; PCT patent application
US95/06540).

Example 2
Cysteine-added variants of Erythropoietin

This example relates to cysteine-added variants of erythropoietin (EPO). EPO is the
hormone primarily responsible for stimulating erythropoiesis or red blood cell formation.
EPO acts on immature red blood cell precursors to stimulate their further proliferation and
differentiation into mature red blood cells. A commercial pharmaceutical version is available
from Amgen, Inc. Human EPO is a 35-39 kDa glycoprotein secreted by the adult kidney.
The mature human protein contains 166 amino acids and is heavily glycosylated. The
sequence of human EPO (SEQ ID NO: 2) is shown in Lin et al., 1985 and Jacobs et al., 1985,
which are incorporated herein by reference. The primary sequence of EPO is highly
conserved among species (greater than 80% identity; Wen et al., 1994). Sugar groups account
for greater than 40% of the protein's mass. Human EPO contains three N-linked
glycosylation sites and one O-linked glycosylation site. The N-linked glycosylation sites are
conserved in different species whereas the O-linked glycosylation site is not. The extensive
glycosylation of EPO has prevented the protein's crystallization, so the X-ray structure of the
protein is not known. Human EPO contains four cysteine residues. The disulfide
assignments are Cys7 to Cys161 and Cys29 to Cys33. Cys33 is not conserved in mouse EPO,
suggesting that the Cys29-Cys33 disulfide bond is not critical to mouse EPO's structure or
function. This conclusion also seems to hold for human EPO (Boissel et al., 1993).

The amino acid sequence of EPO is consistent with the protein being a member of the
GH supergene family and mutational studies support this view of EPO's structure (Boissel et
al., 1993; Wen et al., 1994). A model of the three dimensional structure of EPO, modeled
after the GH structure, has been proposed (Boissel et al., 1993; Wen et al., 1994). Amino
acids in EPO important for receptor binding have been identified through mutagenesis
experiments and reside primarily in the N-terminal half of presumptive helix A and the
C-terminal half of presumptive helix D (Boissel et al., 1993; Wen et al., 1994; Matthews et al.,
1996). Only a single cell surface receptor for EPO has been identified (D'Andrea et al.,
1989). It is believed that EPO dimerizes its receptor in much the same way that GH
dimerizes its receptor (Matthews et al., 1996).

Human EPO contains three sites for N-linked glycosylation (asparagine-24, -38 and
-83) and one site for O-linked glycosylation (serine-126). The N-linked glycosylation sites are
located in the A-B and B-C loops and the O-glycosylation site is located in the C-D loop.
The N-linked glycosylation sites are conserved among species whereas the O-linked
glycosylation site is absent in rodent EPO (Wen et al., 1993). A non-O-linked glycosylated
human variant (containing methionine at position 126) has been described (US patent
4,703,008). The N-linked sugar groups are heavily branched and contain terminal sialic acid
residues (Sasaki et al., 1987; Takeuchi et al., 1988). N-38 and N-83 contain the most highly
branched oligosaccharides (Sasaki et al., 1988).

The terminal sialic acid residues on EPO are critical for the protein's in vivo function
because removal of these residues by digestion eliminates in vivo activity (Fukada et al.,
1989; Spivak and Hogans, 1989). Loss of activity correlates with faster clearance of the
asialated protein from the body. The circulating half-life of the asialated protein in rats is less
than ten minutes, in contrast to that of the sialated protein, which is approximately 2 hr
(Fukada et al., 1989; Spivak and Hogans, 1989). Thus, in vivo activity of EPO directly
correlates with its circulating half-life.

The role of N-linked sugars in EPO's biological activities has been better defined by
mutating, individually and in combination, the three asparagine residues comprising the
N-linked glycosylation sites. EPO muteins in which only a single N-linked glycosylation site
was mutated, i.e., N24Q, N38Q and N83Q, were secreted from mammalian cells as efficiently
as wild type EPO, indicating that N-linked glycosylation at all three sites is not required for
protein secretion (N24Q indicates that the asparagine at position 24 is mutated to glutamine).
In contrast, EPO muteins in which two or more N-linked glycosylation sites were mutated
were secreted less efficiently than wild type EPO from mammalian cells (Yamaguchi et al.,
1991; Delorme et al., 1992). Mutagenesis studies found that each of the single N-linked
glycosylation site muteins had in vitro biological activities equal to or greater than wild type
EPO. Thus, it was concluded that none of the N-linked glycosylation sites is essential for
secretion or in vitro biological activity of EPO. In fact, removal of one of the glycosylation
sites seemed to improve biological activity (Yamaguchi et al., 1991).
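
The mutein shorthand used here (wild type residue, position, replacement residue) is easy to read mechanically. The sketch below parses that notation using the standard one-letter amino acid codes. The function names are hypothetical and it handles only single substitutions of this form.

```python
import re

# Standard one-letter amino acid codes.
ONE_LETTER = {
    "A": "alanine", "R": "arginine", "N": "asparagine", "D": "aspartic acid",
    "C": "cysteine", "Q": "glutamine", "E": "glutamic acid", "G": "glycine",
    "H": "histidine", "I": "isoleucine", "L": "leucine", "K": "lysine",
    "M": "methionine", "F": "phenylalanine", "P": "proline", "S": "serine",
    "T": "threonine", "W": "tryptophan", "Y": "tyrosine", "V": "valine",
}


def parse_mutation(code):
    """Split a code such as 'N24Q' into (wild type residue, position, new residue)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", code.strip())
    if match is None:
        raise ValueError(f"not a single-substitution code: {code!r}")
    wild_type, position, replacement = match.group(1), int(match.group(2)), match.group(3)
    if wild_type not in ONE_LETTER or replacement not in ONE_LETTER:
        raise ValueError(f"unknown amino acid letter in {code!r}")
    return wild_type, position, replacement


def describe(code):
    """Spell out a substitution code in words."""
    wild_type, position, replacement = parse_mutation(code)
    return f"{ONE_LETTER[wild_type]} at position {position} mutated to {ONE_LETTER[replacement]}"


for code in ("N24Q", "N38Q", "N83Q"):
    print(code, "=", describe(code))
```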

The in vivo biological activity of N-linked glycosylation muteins was studied by two
groups. Yamaguchi et al. (1991) concluded that the N24Q and N83Q muteins had in vivo
activities greater than wild type EPO, which correlated with their increased in vitro activities.
These authors found that the N38Q mutein had decreased in vivo activity, about 60% of wild
type EPO. N38 is the most heavily branched of the three N-linked glycosylation sites (Sasaki
et al., 1988). Delorme et al. (1992) reported that mutating any of the N-linked glycosylation
sites reduced in vivo biological activity by about 50%. Muteins in which two or more
glycosylation sites were mutated had decreased in vivo activities in both studies.

The above studies indicate that some N-linked glycosylation is required for in vitro
and in vivo activity of EPO. Individually, however, none of the three glycosylation sites is
absolutely essential for activity. The N-linked sugars increase the apparent molecular weight
of EPO and prolong its circulating half-life, which correlates with bioactivity. Natural EPO
and EPO manufactured in mammalian cells have complex N-linked sugars containing
galactose and terminal sialic acid residues. The galactose residues are recognized by specific
receptors on hepatocytes and promote rapid clearance of EPO from the body unless the
galactose residues are masked by the terminal sialic acid residues.

Mutagenesis studies concluded that O-linked glycosylation is not required for in vitro
or in vivo function of EPO (Delorme et al., 1992). This is in keeping with the observation
that rodent EPO is not O-glycosylated and with the existence of a naturally occurring human
EPO variant in which serine-126 is replaced by methionine, with a corresponding lack of
O-linked glycosylation. Mutagenesis of serine-126 revealed that certain amino acid changes at
this site (to valine, histidine or glutamic acid) yielded EPO muteins with biological activities
similar to wild type EPO, whereas other amino acid changes (to alanine or glycine) resulted in
EPO molecules with severely reduced activities (Delorme et al., 1992). The effect of
changing serine-126 to cysteine was not studied. The in vivo bioactivity of S126V EPO was
found to be similar to wild type EPO (Delorme et al., 1992).

The requirement for complex, N-linked carbohydrates containing terminal sialic acid
residues for in vivo activity of EPO has limited commercial manufacture of the protein to
mammalian cells. The important functions of the sialated N-linked sugars are to prevent
protein aggregation, increase protein stability and prolong the circulating half-life of the
protein. The terminal sialic acid residues prolong EPO's circulating half-life by masking the
underlying galactose residues, which are recognized by specific receptors on hepatocytes and
promote clearance of the asialated protein. EPO can be produced in insect cells and is
N-glycosylated and fully active in vitro; its activity in vivo has not been reported (Wojchowski
et al., 1987).

This example provides for the design of cysteine-added EPO variants and their use in
preparing conjugates using cysteine-reactive PEGs and other cysteine-reactive moieties.
Certain amino acids in EPO are non-essential for biological activity and can be mutated to
cysteine residues without altering the normal disulfide bonding pattern and overall
conformation of the molecule. These amino acids are located in the A-B loop (amino acids
23-58 of the mature protein sequence), the B-C loop (amino acids 77-89 of the mature protein
sequence), the C-D loop (amino acids 108-131 of the mature protein sequence), proximal to
helix A (amino acids 1-8) and distal to helix D (amino acids 153-166 of the mature protein
sequence). Also contemplated as preferred sites for adding cysteine residues are the
N-terminus and C-terminus of the protein sequence. Preferred sites for cysteine substitutions
are the O-linked glycosylation site (serine-126) and the amino acids comprising the three
N-linked glycosylation sites (N24, I25, T26, N38, I39, T40, N83, S84, S85). Glycosylation sites
are attractive sites for introducing cysteine substitutions and attaching PEG molecules to EPO
because (1) these sites are surface exposed; (2) the natural protein can tolerate bulky sugar
groups at these positions; (3) the glycosylation sites are located in the putative loop regions
and away from the receptor binding site (Wen et al., 1994); and (4) mutagenesis studies
indicate these sites (at least individually) are not essential for in vitro or in vivo activity
(Yamaguchi et al., 1991; Delorme et al., 1992). As discussed above, the local conformation
of the region encompassing the O-glycosylation site seems to be important for
biological activity. Whether a cysteine substitution at position 126 affects biological activity
has not been studied. The cysteine-29 to cysteine-33 disulfide bond is not necessary for
biological activity of EPO because changing both residues to tyrosine simultaneously yielded
a biologically active EPO protein (Boissel et al., 1993; Wen et al., 1994). A "free" cysteine
can be created by changing either cysteine-29 or cysteine-33 to another amino acid. Preferred
amino acid changes would be to serine or alanine. The remaining "free" cysteine (cysteine-29
or cysteine-33) would be a preferred site for covalently modifying the protein with
cysteine-reactive moieties.
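
The residue ranges listed above can be captured in a small lookup that reports which preferred region, if any, contains a given position of mature EPO. This is a minimal sketch: the ranges are taken from the paragraph above, while the table layout and the function name region_of are illustrative only.

```python
# Preferred regions for cysteine substitution in mature EPO, as listed in the text
# (name, first residue, last residue; both ends inclusive).
EPO_PREFERRED_REGIONS = [
    ("proximal to helix A", 1, 8),
    ("A-B loop", 23, 58),
    ("B-C loop", 77, 89),
    ("C-D loop", 108, 131),
    ("distal to helix D", 153, 166),
]

# Glycosylation-site residues named in the text as preferred substitution sites.
GLYCOSYLATION_SITE_RESIDUES = {24, 25, 26, 38, 39, 40, 83, 84, 85, 126}


def region_of(position):
    """Return the name of the preferred region containing position, or None."""
    for name, first, last in EPO_PREFERRED_REGIONS:
        if first <= position <= last:
            return name
    return None


for pos in (24, 29, 83, 100, 126, 161):
    tag = " (glycosylation site)" if pos in GLYCOSYLATION_SITE_RESIDUES else ""
    print(pos, region_of(pos), tag)
```

Every glycosylation-site residue falls inside one of the loop ranges, which agrees with the statement that these sites lie in the putative loop regions.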

Bill et al. (1995) individually substituted cysteine for N24, N38 and N83 and reported
that the muteins had greatly reduced in vitro biological activities (less than 20% of wild type
activity). Bill et al. (1995) expressed the EPO variants as fusion proteins (fused to
glutathione-S-transferase) in bacteria. One aspect of the present invention is to provide
expression systems in which the N24C, N38C and N83C EPO variants will have in vitro
biological activities more similar to wild type EPO.

US patent 4,703,008 contemplates naturally occurring variants of EPO as well as
amino acid substitutions that are present in EPO proteins of mammals. Ovine EPO contains a
cysteine residue at position 88 of the polypeptide chain. The inventor is unaware of any other
naturally occurring human or animal cysteine variants of EPO in which the cysteine residue
occurs in the polypeptide regions disclosed herein as being useful for generating
cysteine-added EPO variants. Patent 4,703,008 specifically teaches away from cysteine-added
EPO variants by suggesting that expression of EPO might be improved by deleting cysteine
residues or substituting naturally occurring cysteine residues with serine or histidine residues.

The mature protein form of EPO can contain 165 or 166 amino acids because of
post-translational removal of the C-terminal arginine. Asp-165 is the C-terminus of the 165 amino
acid form and Arg-166 is the C-terminal amino acid of the 166 amino acid form. The
cysteine substitution and insertion mutations described herein can comprise either the 165 or
166 amino acid forms of mature EPO.

A cDNA encoding EPO can be cloned using the polymerase chain reaction (PCR)
technique from the human HepG2 or Hep3B cell lines, which are known to express EPO
when treated with hypoxia or cobalt chloride (Wen et al., 1993) and are available from the
American Type Culture Collection (ATCC). Cysteine mutations can be introduced into the
cDNA by standard phage, plasmid or PCR mutagenesis procedures as described for GH. As
described above, the preferred sites for introduction of cysteine substitution mutations are in
the A-B loop, the B-C loop, the C-D loop and the region proximal to helix A and distal to
helix D. The most preferred sites in these regions are the N- and O-linked glycosylation sites:
S126C; N24C; I25C, T26C; N38C; I39C, T40C; N83C, S84C and S85C. Other preferred
sites for cysteine substitution mutagenesis are in the A-B loop, the B-C loop and the C-D
loop, amino acids surrounding the glycosylation sites and the region of the protein proximal
to helix A and distal to helix D (Boissel et al., 1993; Wen et al., 1994). Other preferred sites
for cysteine substitutions in these regions are: A1, P2, P3, R4, D8, S9, T27, G28, A30, E31,
H32, S34, N36, D43, T44, K45, N47, A50, K52, E55, G57, Q58, G77, Q78, A79, Q86, W88,
E89, T107, R110, A111, G113, A114, Q115, K116, E117, A118, S120, P121, P122, D123,
A124, A125, A127, A128, T132, K154, T157, G158, E159, A160, T163, G164, D165 and
R166. Cysteine residues also can be introduced proximal to the first amino acid of the mature
protein, i.e., proximal to A1, or distal to the final amino acid in the mature protein, i.e., distal
to D165 or R166. Other variants in which cys-29 or cys-33 have been replaced with other
amino acids, preferably serine or alanine, also are provided.

Wild type EPO and EPO muteins can be expressed using insect cells to determine
whether the cysteine-added muteins are biologically active. DNAs encoding EPO/EPO
muteins can be cloned into the Baculovirus expression vector pVL1392 (available from
Invitrogen, Inc. and Sigma Corporation, St. Louis, MO) and used to infect insect cells.
Recombinant Baculoviruses producing EPO can be identified by Western blots of infected
insect cell conditioned media using polyclonal anti-human EPO antiserum (available from
R&D Systems). The secreted EPO mutein proteins can be purified by conventional
chromatographic procedures well known to those of skill in the art. Protein concentrations
can be determined using commercially available protein assay kits or ELISA assay kits
(available from R&D Systems and Bio-Rad Laboratories).

Purified EPO and EPO muteins can be tested in cell proliferation assays using
EPO-responsive cell lines such as UT7-epo (Wen et al., 1994) or TF1 (available from the ATCC)
to measure specific activities of the proteins. Cells can be plated in 96-well microtiter plates
with various concentrations of EPO. Assays should be performed in triplicate. After 1-3
days in culture, cell proliferation can be measured by 3H-thymidine incorporation, as
described above for GH. The concentration of protein giving half-maximal stimulation
(EC50) can be determined for each mutein. Assays should be performed at least three times
for each mutein, with triplicate wells for each data point. EC50 values can be used to
compare the relative potencies of the muteins. Alternatively, cell proliferation in response to
added EPO muteins can be analyzed using an MTT dye-exclusion assay (Komatsu et al.,
1991). Proteins displaying similar optimal levels of stimulation and EC50 values comparable
to or greater than wild type EPO are preferable.
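
As a rough illustration of how the EC50 comparison described above might be computed, the following sketch estimates EC50 by log-linear interpolation between the two doses that bracket the half-maximal response. The function name, the interpolation approach and the sample values are assumptions made for illustration only; a four-parameter logistic fit would be the more usual choice in practice.

```python
import math

def estimate_ec50(concentrations, responses):
    """Estimate EC50 by log-linear interpolation.

    concentrations: ascending, positive doses (any consistent unit).
    responses: mean proliferation signal at each dose (e.g. cpm).
    Returns the dose giving a response halfway between the lowest and
    highest observed responses, or None if that level is never crossed.
    """
    low, high = min(responses), max(responses)
    half = low + (high - low) / 2.0
    points = list(zip(concentrations, responses))
    for (c0, r0), (c1, r1) in zip(points, points[1:]):
        if r0 <= half <= r1 and r1 != r0:
            frac = (half - r0) / (r1 - r0)
            log_c = math.log10(c0) + frac * (math.log10(c1) - math.log10(c0))
            return 10 ** log_c
    return None

# Hypothetical means of triplicate wells (cpm); not data from the source.
doses = [0.01, 0.1, 1.0, 10.0, 100.0]      # illustrative units, e.g. ng/ml
wild_type = [100, 300, 1100, 1900, 2100]
mutein = [100, 150, 500, 1700, 2100]

ec50_wt = estimate_ec50(doses, wild_type)
ec50_mut = estimate_ec50(doses, mutein)
# A ratio below 1 means the mutein needs more protein for the same effect.
relative_potency = ec50_wt / ec50_mut
```

Interpolating on a logarithmic dose axis matches the way serial dilutions are usually spaced, which is why the sketch works in log10 units rather than raw concentrations.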

The above studies confirm identification of amino acid residues in EPO that can be
changed to cysteine residues and retain biological activity. Muteins that retain activity can be
PEGylated using a cysteine-reactive 8 kDa PEG-maleimide as described above for GH
muteins. Wild type EPO should be used as a control since it should not react with the
cysteine-reactive PEG under identical partial reduction conditions. The lowest amount of
PEG that gives significant quantities of mono-PEGylated product without giving
di-PEGylated product should be considered optimum. Mono-PEGylated protein can be purified
from non-PEGylated protein and unreacted PEG by size-exclusion or ion exchange
chromatography. The purified PEGylated proteins should be tested in the cell proliferation
assay described above to determine their bioactivities.

One or more of the PEGylated EPO muteins that retain in vitro bioactivity are
candidates for testing in animal disease models. PEGylation of the protein at the proper amino
acid can be determined as described for GH.
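
The criterion given above for the optimum amount of PEG (the lowest amount giving significant mono-PEGylated product without di-PEGylated product) can be sketched as a simple selection over a titration series. The two cutoff values, the function name and the example numbers below are assumptions chosen for illustration; the text does not define numeric thresholds.

```python
def optimal_peg_ratio(titration):
    """Pick the lowest PEG:protein ratio meeting the stated criterion.

    titration: list of (ratio, mono_fraction, di_fraction) tuples, where
    the fractions might be estimated from, e.g., gel densitometry.
    Returns None if no tested ratio qualifies.
    """
    min_mono = 0.20   # assumed cutoff for "significant" mono-PEGylated product
    max_di = 0.02     # assumed detection limit for di-PEGylated product
    for ratio, mono, di in sorted(titration):
        if mono >= min_mono and di <= max_di:
            return ratio
    return None

# Hypothetical titration: (molar PEG:protein ratio, mono fraction, di fraction)
results = [(20, 0.60, 0.15), (1, 0.05, 0.00), (5, 0.35, 0.01), (10, 0.50, 0.05)]
best = optimal_peg_ratio(results)
```

With these made-up numbers the 5:1 ratio is selected, because the 1:1 reaction gives too little mono-PEGylated product and the 10:1 and 20:1 reactions give detectable di-PEGylated product.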

In vivo testing of PEGylated EPO muteins expressed using insect cells may require 
that they be re-engineered for expression in mammalian cells to ensure proper glycosylation. 
PEG-EPO candidates produced using insect cells can be tested in the animal models 
described below to determine if they are active in vivo and whether they are as active as PEG- 
EPO produced using mammalian cell expression systems. For expression in mammalian cells, 
the EPO muteins can be subcloned into commercially available eukaryotic expression vectors 
and used to stably transform Chinese Hamster Ovary (CHO) cells (available from the ATCC). 
Sublines can be screened for EPO expression using ELISA assays. Sufficient quantities of 
the insect cell- and mammalian cell-produced EPO muteins can be prepared to compare their 
biological activities in animal anemia models. 

In vivo bioactivities of the EPO muteins can be tested using the artificial
polycythemia or starved rodent models (Cotes and Bangham, 1961; Goldwasser and Gross,
1975). In the starved rodent model, rats are deprived of food on day one and treated with test
samples on days two and three. On day four, rats receive an injection of radioactive iron-59.
Approximately 18 h later, rats are anesthetized and blood samples drawn. The percent
conversion of labeled iron into red blood cells is then determined. In the artificial
polycythemia model, mice are maintained in a closed tank and exposed for several days to
hypobaric air. The animals are then brought to normal air pressure. Red blood cell formation
is suppressed for several days. On day four or six after return to normal air pressure, mice are
injected with erythropoietin or saline. Mice receive one injection per day for one to two days.
One day later the animals receive an intravenous injection of labeled iron-59. The mice are
euthanized 20 h later and the amount of labeled iron incorporated into red blood cells
determined. EPO stimulates red blood cell formation in both models as measured by a
dose-dependent increase in labeled iron incorporated into red blood cells. In both models different
dosing regimens and different times of injections can be studied to determine if PEG-EPO is
biologically active and/or more potent and produces longer acting effects than natural EPO.
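
The percent conversion of labeled iron into red blood cells mentioned above can be illustrated with a short calculation. This is only a sketch: the text does not say how total blood volume is estimated, so the fixed blood volume fraction used here, together with the function name and example numbers, is an assumption.

```python
def percent_iron_incorporation(sample_cpm, sample_volume_ml,
                               body_weight_g, injected_cpm,
                               blood_fraction=0.07):
    """Percent of the injected iron-59 recovered in circulating blood.

    blood_fraction is an assumed blood volume in ml per g of body weight;
    it is not specified in the text and should be replaced by whatever
    estimate the chosen protocol prescribes.
    """
    total_blood_ml = body_weight_g * blood_fraction
    total_cpm = sample_cpm / sample_volume_ml * total_blood_ml
    return 100.0 * total_cpm / injected_cpm

# Hypothetical animal: 5000 cpm counted in a 0.5 ml blood sample,
# 200 g body weight, 500000 cpm injected.
incorporation = percent_iron_incorporation(5000, 0.5, 200, 500000)
```

Comparing this percentage across dose groups gives the dose-dependent increase that both models use as their readout.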

Example 3 
Alpha Interferon
Alpha interferon is produced by leukocytes and has antiviral, anti-tumor and
immunomodulatory effects. There are at least 20 distinct alpha interferon genes that encode
proteins that share 70% or greater amino acid identity. Amino acid sequences of the known
alpha interferon species are given in Blatt et al. (1996). A "consensus" interferon that
incorporates the most common amino acids into a single polypeptide chain has been
described (Blatt et al., 1996). A hybrid alpha interferon protein may be produced by splicing
different parts of alpha interferon proteins into a single protein (Horisberger and Di Marco,
1995). Some alpha interferons contain N-linked glycosylation sites in the region proximal to
helix A and near the B-C loop (Blatt et al., 1996). The alpha 2 interferon protein (SEQ ID NO:
3) contains four cysteine residues that form two disulfide bonds. The cys1-cys98 disulfide
bond (cys1-cys99 in some alpha interferon species such as alpha 1; SEQ ID NO: 4) is not
essential for activity. The alpha 2-interferon protein does not contain any N-linked
glycosylation sites. The crystal structure of alpha interferon has been determined
(Radhakrishnan et al., 1996).

This example provides cysteine-added variants in the region proximal to the A helix,
distal to the E helix, in the A-B loop, in the B-C loop, in the C-D loop and in the D-E loop.
Preferred sites for the introduction of cysteine residues in these regions of the alpha
interferon-2 species are: D2, L3, P4, Q5, T6, S8, Q20, R22, K23, S25, F27, S28, K31, D32,
R33, D35, G37, F38, Q40, E41, E42, F43, G44, N45, Q46, F47, Q48, K49, A50, N65, S68,
T69, K70, D71, S72, S73, A74, A75, D77, E78, T79, Y89, Q90, Q91, N93, D94, E96, A97,
Q101, G102, G104, T106, E107, T108, P109, K112, E113, D114, S115, K131, E132, K133,
K134, Y135, S136, A139, S152, S154, T155, N156, L157, Q158, E159, S160, L161, R162,
S163, K164, E165. Variants in which cysteine residues are introduced proximal to the first
amino acid of the mature protein, i.e., proximal to C1, or distal to the final amino acid in the
mature protein, i.e., distal to E165, are provided. Other variants in which cys-1 or cys-98
(cys-99 in some alpha interferon species) have been replaced with other amino acids, preferably
serine or alanine, also are provided. Other variants in which Cys-1 has been deleted
(des Cys-1) also are provided. The cysteine variants may be in the context of any naturally occurring
or non-natural alpha interferon sequence, e.g., consensus interferon or interferon protein
hybrids. Some naturally occurring alpha interferon species (e.g., alpha interferon-1) contain a
naturally occurring "free" cysteine. In such interferon species the naturally occurring free
cysteine can be changed to another amino acid, preferably serine or alanine.

This example also provides cysteine variants of other alpha interferon species,
including consensus interferon, at equivalent sites in these proteins. The alignment of the
alpha interferon-2 species with other known alpha interferon species and consensus interferon
is given in Blatt et al. (1996). The crystal structure of alpha interferon-2 has been determined
by Radhakrishnan et al. (1996). Lydon et al. (1985) found that deletion of the first four
amino acids from the N-terminus of alpha interferon did not affect biological activity.
Valenzuela et al. (1985) found that substitution of Phe-47 in alpha interferon-2 with Cys, Tyr
or Ser did not alter biological activity of the protein. Cys-1 and Cys-98 have been changed
individually to glycine and serine, respectively, without altering biological activity of the
protein (DeChiara et al., 1986).
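
Locating the "equivalent site" in another alpha interferon species amounts to carrying a residue number across a sequence alignment, which is how cys-98 in one species can correspond to cys-99 in another. The sketch below shows the bookkeeping on a toy alignment; the function name and the seven-letter sequences are hypothetical and are not interferon sequences.

```python
def map_position(ref_aligned, target_aligned, ref_pos):
    """Map a 1-based residue number in the reference sequence to the
    equivalent residue number in the target sequence.

    Both arguments are rows of a pairwise alignment: equal-length strings
    that use '-' for gaps. Returns None if the reference residue is
    aligned to a gap in the target.
    """
    if len(ref_aligned) != len(target_aligned):
        raise ValueError("aligned sequences must have equal length")
    ref_i = target_i = 0
    for r, t in zip(ref_aligned, target_aligned):
        if r != "-":
            ref_i += 1
        if t != "-":
            target_i += 1
        if r != "-" and ref_i == ref_pos:
            return target_i if t != "-" else None
    raise ValueError("position beyond end of reference sequence")

# Toy alignment: the target carries one extra residue (K) that the
# reference lacks, so later reference positions shift by one.
reference = "ACD-EFG"
target = "ACDKEFG"
equivalent = map_position(reference, target, 4)   # reference E4
```

Here reference residue 4 maps to target residue 5, the same one-position offset that distinguishes cys-98 from cys-99 in the text.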

DNA sequences encoding alpha interferon-2 can be amplified from human genomic
DNA, since alpha interferon genes do not contain introns (Pestka et al., 1987). The DNA
sequence of alpha interferon-2 is given in Goeddel et al. (1980). Alternatively, a cDNA for
alpha interferon-2 can be isolated from human lymphoblastoid cell lines that are known to
express alpha interferon spontaneously or after exposure to viruses (Goeddel et al., 1980;
Pickering et al., 1980). Many of these cell lines are available from the American Type
Culture Collection (Rockville, MD). Specific mutations can be introduced into the alpha
interferon sequence using plasmid-based site-directed mutagenesis kits (e.g., Quick-Change
Mutagenesis Kit, Stratagene, Inc.), phage mutagenesis strategies or employing PCR
mutagenesis as described for GH.
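
At the sequence level, each of the mutagenesis approaches mentioned above has the same effect: the codon for the chosen residue is replaced by a cysteine codon. The following sketch shows that substitution on a short illustrative stretch of coding DNA. The function name and the six-codon example are assumptions made for illustration and do not represent a complete gene or a primer design.

```python
def substitute_codon(coding_dna, residue_number, new_codon="TGC"):
    """Return coding_dna with the codon for a 1-based residue replaced.

    coding_dna is assumed to start at the first codon of the mature
    protein. TGC (like TGT) encodes cysteine.
    """
    if len(new_codon) != 3:
        raise ValueError("a codon must be three nucleotides long")
    start = (residue_number - 1) * 3
    if start < 0 or start + 3 > len(coding_dna):
        raise ValueError("residue number outside coding sequence")
    return coding_dna[:start] + new_codon + coding_dna[start + 3:]

# Illustrative six-codon stretch encoding C-D-L-P-Q-T.
stretch = "TGTGATCTGCCTCAAACC"
# Replace the fifth codon (CAA, glutamine) with a cysteine codon,
# i.e. a Q5C substitution in this toy sequence.
mutant = substitute_codon(stretch, 5)
```

In a real experiment the altered codon would sit in the middle of a mutagenic oligonucleotide with matching flanking sequence on both sides.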

Alpha interferon has been successfully produced in E. coli as an intracellular protein
(Tarnowski et al., 1986; Thatcher and Panayotatos, 1986). Similar procedures can be used to
express alpha interferon muteins. Plasmids encoding alpha interferon or alpha interferon
muteins can be cloned into an E. coli expression vector such as pET15b (Novagen, Inc.) that
uses the strong T7 promoter or pCYB1 (New England BioLabs, Beverly, MA) that uses the
TAC promoter. Expression of the protein can be induced by adding IPTG to the growth
media.

Recombinant alpha interferon expressed in E. coli is sometimes soluble and
sometimes insoluble (Tarnowski et al., 1986; Thatcher and Panayotatos, 1986). Insolubility
appears to be related to the degree of overexpression of the protein. Insoluble alpha interferon
proteins can be recovered as inclusion bodies and renatured to a fully active conformation
following standard oxidative refolding protocols (Thatcher and Panayotatos, 1986; Cox et al.,
1994). The alpha interferon proteins can be purified further using other chromatographic
methods such as ion-exchange, hydrophobic interaction, size-exclusion and reversed phase
resins (Thatcher and Panayotatos, 1986). Protein concentrations can be determined using
commercially available protein assay kits (Bio-Rad Laboratories).

If E. coli expression of alpha interferon muteins is not successful, one can express the
proteins in insect cells as secreted proteins as described for GH. The proteins can be
modified to contain the natural alpha interferon signal sequence (Goeddel et al., 1980) or the
honeybee melittin signal sequence (Invitrogen, Inc.) to promote secretion of the proteins.
Alpha interferon and alpha interferon muteins can be purified from conditioned media using
conventional chromatography procedures. Antibodies to alpha interferon can be used in
conjunction with Western blots to localize fractions containing the alpha interferon proteins
during chromatography. Alternatively, fractions containing alpha interferon proteins can be
identified using ELISAs.

Bioactivities of alpha interferon and alpha interferon muteins can be measured using
an in vitro viral plaque reduction assay (Ozes et al., 1992; Lewis, 1995). Human HeLa cells
can be plated in 96-well plates and grown to near confluency at 37°C. The cells are then
washed and treated for 24 hours with different concentrations of each alpha interferon
preparation. Controls should include no alpha interferon and wild type alpha interferon
(commercially available from Endogen, Inc., Woburn, MA). A virus such as Vesicular
stomatitis virus (VSV) or encephalomyocarditis virus (EMCV) is added to the plates and the
plates incubated for a further 24-48 hours at 37°C. Additional controls should include
samples without virus. When 90% or more of the cells have been killed in the virus-treated,
no alpha interferon control wells (determined by visual inspection of the wells), the cell
monolayers are stained with crystal violet and absorbance of the wells read using a microplate
reader. Alternatively, the cell monolayers can be stained with the dye MTT (Lewis, 1995).
Samples should be analyzed in duplicate or triplicate. EC50 values (the amount of protein
required to inhibit the cytopathic effect of the virus by 50%) can be used to compare the
relative potencies of the proteins. Wild type alpha interferon-2 protects cells from the
cytopathic effects of VSV and EMCV and has a specific activity of approximately 2 x 10^8
units/mg in this assay (Ozes et al., 1992). Alpha interferon muteins displaying EC50 values
comparable to wild type alpha interferon are preferable.

Alpha interferon muteins that retain activity can be PEGylated using procedures
similar to those described for GH. Wild type alpha interferon-2 can serve as a control since it
should not PEGylate under similar conditions. The lowest amount of PEG that gives
significant quantities of mono-PEGylated product without giving di-PEGylated product should
be considered optimum. Mono-PEGylated protein can be purified from non-PEGylated
protein and unreacted PEG by size-exclusion or ion exchange chromatography. The purified
PEGylated proteins can be tested in the viral plaque reduction bioassay described above to
determine their bioactivities. PEGylated alpha interferon proteins with bioactivities
comparable to wild type alpha interferon are preferable. Mapping the PEG attachment site
and determination of pharmacokinetic data for the PEGylated protein can be performed as
described for GH.
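
When the viral plaque reduction bioassay described above is read by crystal violet absorbance, each well can be expressed as percent protection relative to the two control conditions before EC50 values are compared. The normalization below is a common convention rather than a procedure stated in the text, and the function name and absorbance values are illustrative assumptions.

```python
def percent_protection(a_sample, a_virus_only, a_cells_only):
    """Normalize a crystal violet absorbance reading to percent protection.

    a_virus_only: mean absorbance of virus-treated wells that received no
                  alpha interferon (taken as 0% protection).
    a_cells_only: mean absorbance of wells without virus (taken as 100%).
    """
    span = a_cells_only - a_virus_only
    if span <= 0:
        raise ValueError("controls do not define a usable assay window")
    return 100.0 * (a_sample - a_virus_only) / span

# Hypothetical absorbance readings for one interferon-treated well.
protection = percent_protection(0.65, 0.10, 1.20)
```

With these made-up readings the well shows about 50% protection, so the interferon concentration in that well would approximate the EC50.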

In vivo bioactivities of the PEG-alpha interferon muteins can be tested using tumor
xenograft models in nude mice and viral infection models (Balkwill, 1986; Fish et al., 1986).
Since PEG-alpha interferon bioactivity may be species-specific, one should confirm activity
of the PEGylated protein using appropriate animal cell lines in in vitro virus plaque reduction
assays, similar to those described above. Next, one should explore the effects of different
dosing regimens and different times of injections to determine if PEG-alpha interferon is
more potent and produces longer lasting effects than non-PEGylated alpha interferon.

The novel alpha interferon-derived molecules of this example can be formulated and
tested for activity essentially as set forth in Examples 1 and 2, substituting, however, the
appropriate assays and other considerations known in the art related to the specific proteins of
this example.

Example 4 

Beta Interferon

Beta interferon is produced by fibroblasts and exhibits antiviral, antitumor and
immunomodulatory effects. The single-copy beta interferon gene encodes a preprotein that is
cleaved to yield a mature protein of 166 amino acids (Taniguchi et al., 1980; SEQ ID NO: 5).
The protein contains three cysteines, one of which, cysteine-17, is "free", i.e., it does not
participate in a disulfide bond. The protein contains one N-linked glycosylation site. The
crystal structure of the protein has been determined (Karpusas et al., 1997).

This example provides cysteine-added variants at any of the three amino acids that
comprise the N-linked glycosylation site, i.e., N80C, E81C or T82C. This example also
provides cysteine-added variants in the region proximal to the A helix, distal to the E helix, in
the A-B loop, in the B-C loop, in the C-D loop and in the D-E loop. Preferred sites for
introduction of cysteine residues in these regions are: M1, S2, Y3, N4, L5, Q23, N25, G26,
R27, E29, Y30, K33, D34, R35, N37, D39, E42, E43, K45, Q46, I47, Q48, Q49, Q51, K52,
E53, A68, F70, R71, Q72, D73, S74, S75, S76, T77, G78, E107, K108, E109, D110, F111,
T112, R113, G114, K115, L116, A135, K136, E137, K138, S139, I157, N158, R159, L160,
T161, G162, Y163, L164, R165 and N166. Variants in which cysteine residues are introduced
proximal to the first amino acid of the mature protein, i.e., proximal to M1, or distal to the
final amino acid in the mature protein, i.e., distal to N166, also are provided.

These variants are produced in the context of the natural protein sequence or a variant
protein in which the naturally occurring "free" cysteine residue (cysteine-17) has been
changed to another amino acid, preferably serine or alanine.
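
The three residues of an N-linked glycosylation site noted above follow the general N-X-S/T pattern, where X is any amino acid except proline. The sketch below scans a protein sequence for that pattern and reports the three residues at each site, any of which could be considered for cysteine substitution. The function name is an assumption, and the input is a short illustrative fragment rather than a full protein sequence.

```python
import re

def find_n_glycosylation_sequons(protein):
    """Return (position, tripeptide) for each N-X-S/T sequon (X is not P).

    Positions are 1-based and refer to the asparagine within the
    sequence that was passed in.
    """
    # A lookahead is used so that overlapping sequons are all reported.
    return [(m.start() + 1, m.group(1))
            for m in re.finditer(r"(?=(N[^P][ST]))", protein)]

# Short illustrative fragment; positions are relative to the fragment.
fragment = "GWNETIVENLL"
sites = find_n_glycosylation_sequons(fragment)
```

For this fragment the scan reports a single N-E-T site starting at the third residue; numbering against a full-length mature protein would require passing the complete sequence.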

The novel beta interferon-derived molecules of this example can be formulated and
tested for activity essentially as set forth in Examples 1 and 2, substituting, however, the
appropriate assays and other considerations known in the art related to the specific proteins of
this example.

Example 5

Granulocyte Colony-Stimulating Factor (G-CSF)

G-CSF is a pluripotent cytokine that stimulates the proliferation, differentiation and
function of granulocytes. The protein is produced by activated monocytes and macrophages.
The amino acid sequence of G-CSF (SEQ ID NO: 6) is given in Souza et al. (1986), Nagata et
al. (1986a, b) and US patent 4,810,643, all incorporated herein by reference. The human
protein is synthesized as a preprotein of 204 or 207 amino acids that is cleaved to yield
mature proteins of 174 or 177 amino acids. The larger form has lower specific activity than
the smaller form. The protein contains five cysteines, four of which are involved in disulfide
bonds. Cysteine-17 is not involved in a disulfide bond. Substitution of cysteine-17 with
serine yields a mutant G-CSF protein that is fully active (US patent 4,810,643). The protein
is O-glycosylated at threonine-133 of the mature protein.

This example provides a cysteine-added variant at threonine-133. This example
provides other cysteine-added variants in the region proximal to helix A, distal to helix D, in
the A-B loop, B-C loop and C-D loop. Preferred sites for introduction of cysteine
substitutions in these regions are: T1, P2, L3, G4, P5, A6, S7, S8, L9, P10, Q11, S12, T38,
K40, S53, G55, W58, A59, P60, S62, S63, P65, S66, Q67, A68, Q70, A72, Q90, A91, E93,
G94, S96, E98, G100, G125, M126, A127, A129, Q131, T133, Q134, G135, A136, A139,
A141, S142, A143, Q145, Q173 and P174. Variants in which cysteine residues are introduced
proximal to the first amino acid of the mature protein, i.e., proximal to T1, or distal to the
final amino acid in the mature protein, i.e., distal to P174, are provided. These variants are
provided in the context of the natural protein sequence or a variant protein in which the
naturally occurring "free" cysteine residue (cysteine-17) has been changed to another amino
acid, preferably serine or alanine.
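
The substitution shorthand used for the preferred sites (wild type residue, position, new residue, as in T133C) lends itself to a simple consistency check before a variant is constructed. The sketch below parses that notation and applies it to a sequence, refusing to proceed if the stated wild type residue does not match. The function name is an assumption; the twelve-residue peptide is assembled from the first twelve preferred sites listed above (T1 through S12) and is used only as a short test input.

```python
import re

def apply_substitution(protein, mutation):
    """Apply a substitution written as <wild type><position><new>, e.g. A6C.

    Raises ValueError if the notation is malformed, the position is out
    of range, or the residue at that position is not the stated wild type.
    """
    m = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if m is None:
        raise ValueError(f"unrecognized mutation: {mutation!r}")
    wild_type, position, new = m.group(1), int(m.group(2)), m.group(3)
    if not 1 <= position <= len(protein):
        raise ValueError(f"position {position} is outside the sequence")
    found = protein[position - 1]
    if found != wild_type:
        raise ValueError(f"expected {wild_type} at {position}, found {found}")
    return protein[:position - 1] + new + protein[position:]

# N-terminal residues 1-12 as given in the preferred site list above.
n_terminus = "TPLGPASSLPQS"
a6c = apply_substitution(n_terminus, "A6C")
```

Checking the wild type residue first catches numbering mistakes, for example confusing preprotein numbering with mature protein numbering.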

A cDNA encoding human G-CSF can be purchased from R&D Systems (Minneapolis,
MN) or amplified using PCR from mRNA isolated from human carcinoma cell lines such as
5637 and U87-MG known to express G-CSF constitutively (Park et al., 1989; Nagata, 1994).
These cell lines are available from the American Type Culture Collection (Rockville, MD).
Specific mutations can be introduced into the G-CSF sequence using plasmid-based
site-directed mutagenesis kits (e.g., Quick-Change Mutagenesis Kit, Stratagene, Inc.), phage
mutagenesis methods or employing PCR mutagenesis as described for GH.

G-CSF has been successfully produced in E. coli as an intracellular protein (Souza et
al., 1986). One can employ similar procedures to express G-CSF and G-CSF muteins.
Plasmids encoding G-CSF or G-CSF muteins can be cloned into an E. coli expression vector
such as pET15b (available from Novagen, Inc., Madison, WI) that uses the strong T7
promoter or pCYB1 (available from New England BioLabs, Beverly, MA) that uses the
strong TAC promoter. Expression of the protein can be induced by adding IPTG to the
growth media. Recombinant G-CSF expressed in E. coli is insoluble and can be recovered as
inclusion bodies. The protein can be renatured to a fully active conformation following
standard oxidative refolding protocols (Souza et al., 1986; Lu et al., 1992; Cox et al., 1994).
Similar procedures can be used to refold cysteine muteins. The proteins can be purified
further using other chromatographic methods such as ion exchange, hydrophobic interaction,
size-exclusion and reversed phase resins (Souza et al., 1986; Kuga et al., 1989; Lu et al.,
1992). Protein concentrations can be determined using commercially available protein assay
kits (Bio-Rad Laboratories).

If E. coli expression of G-CSF or G-CSF muteins is not successful, one can express
G-CSF and G-CSF muteins in insect cells as secreted proteins as described for GH. The
proteins can be modified to contain the natural G-CSF signal sequence (Souza et al., 1986;
Nagata et al., 1986a; Nagata et al., 1986b) or the honeybee melittin signal sequence
(Invitrogen, Inc., Carlsbad, CA) to promote secretion of the proteins. G-CSF and G-CSF
muteins can be purified from conditioned media using conventional chromatography
procedures. Antibodies to G-CSF can be used in conjunction with Western blots to localize
fractions containing the G-CSF proteins during chromatography. Alternatively, fractions
containing G-CSF proteins can be identified using ELISAs.

G-CSF muteins also can be expressed in mammalian cells as described for 
erythropoietin in Example 2. 

Bioactivities of G-CSF and the G-CSF muteins can be measured using an in vitro cell
proliferation assay. The mouse NFS-60 cell line and the human AML-193 cell line can be
used to measure G-CSF bioactivity (Tsuchiya et al., 1986; Lange et al., 1987; Shirafuji et al.,
1989). Both cell lines proliferate in response to human G-CSF. The AML-193 cell line is
preferable since it is of human origin, which eliminates the possibility of a false conclusion
resulting from species differences. The NFS-60 cell line proliferates in response to G-CSF
with a half-maximal effective concentration (EC50) of 10-20 picomolar. Purified G-CSF and
G-CSF muteins can be tested in cell proliferation assays using these cell lines to determine
specific activities of the proteins, using published methods (Tsuchiya et al., 1986; Lange et
al., 1987; Shirafuji et al., 1989). Cells can be plated in 96-well tissue culture dishes with
different concentrations of G-CSF or G-CSF muteins. After 1-3 days at 37°C in a humidified
tissue culture incubator, proliferation can be measured by 3H-thymidine incorporation as
described for GH. Assays should be performed at least three times for each mutein using
triplicate wells for each data point. EC50 values can be used to compare the relative
potencies of the muteins. G-CSF muteins displaying similar optimal levels of stimulation and
EC50 values comparable to wild type G-CSF are preferable.

G-CSF muteins that retain activity can be PEGylated using procedures similar to those
described for GH. Wild type G-CSF and ser-17 G-CSF can serve as controls since they
should not PEGylate under similar conditions. The lowest amount of PEG that gives
significant quantities of mono-PEGylated product without giving di-PEGylated product
should be considered optimum. Mono-PEGylated protein can be purified from
non-PEGylated protein and unreacted PEG by size-exclusion or ion exchange chromatography.
The purified PEGylated proteins can be tested in the cell proliferation assay described above
to determine their bioactivities.

The PEG site in the protein can be mapped using procedures similar to those 
described for GH. Pharmacokinetic data for the PEGylated proteins can be obtained using 
procedures similar to those described for GH. 

Initial studies to demonstrate in vivo efficacy of PEG-G-CSF can be done in normal
Sprague-Dawley rats, which can be purchased from Charles River. Groups of rats should
receive single subcutaneous or intravenous injections of various doses of G-CSF,
PEG-G-CSF or placebo. Animals should be sacrificed at daily intervals for up to a week for
determination of neutrophil and total white blood cell counts in peripheral blood. Other blood
cell types (platelets and red blood cells) can be measured to demonstrate cell specificity.

The efficacy of PEG-G-CSF can be tested in a rat neutropenia model. Neutropenia
can be induced by treatment with cyclophosphamide, which is a commonly used
chemotherapeutic agent that is myelosuppressive. G-CSF accelerates recovery of normal
neutrophil levels in cyclophosphamide-treated animals (Kubota et al., 1990). Rats receive an
injection of cyclophosphamide on day 0 to induce neutropenia. The animals are then
divided into different groups, which will receive subcutaneous injections of G-CSF,
PEG-G-CSF or placebo. Neutrophil and total white blood cell counts in peripheral blood should be
measured daily until they return to normal levels. Initially one should confirm that G-CSF
accelerates recovery from neutropenia when injected daily. Next, one should explore the
effects of different dosing regimens and different times of injections to determine if
PEG-G-CSF is more potent and produces longer lasting effects than non-PEGylated G-CSF.
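
One way to summarize the daily counts from the neutropenia model described above is the day on which each treatment group returns to a normal neutrophil level after its nadir. The sketch below computes that day from a list of daily group means. The function name, the threshold and the counts are hypothetical and serve only to illustrate the comparison between groups.

```python
def days_to_recovery(daily_counts, normal_threshold):
    """Return the first day, at or after the nadir, on which the neutrophil
    count is at or above normal_threshold, or None if it never recovers.

    daily_counts[0] is the count on day 0 (cyclophosphamide injection).
    """
    nadir_day = min(range(len(daily_counts)), key=daily_counts.__getitem__)
    for day in range(nadir_day, len(daily_counts)):
        if daily_counts[day] >= normal_threshold:
            return day
    return None

# Hypothetical group means (arbitrary units), days 0 through 7.
placebo = [2.0, 1.0, 0.3, 0.2, 0.4, 0.9, 1.5, 2.1]
g_csf = [2.0, 1.0, 0.3, 0.5, 1.2, 2.2, 2.5, 2.4]

placebo_day = days_to_recovery(placebo, 2.0)
g_csf_day = days_to_recovery(g_csf, 2.0)
```

In this made-up data set the treated group recovers two days earlier than the placebo group; the same summary could be computed for each PEG-G-CSF dosing regimen.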

The novel G-CSF-derived molecules of this example can be formulated and tested for
activity essentially as set forth in Examples 1 and 2, substituting, however, the appropriate
assays and other considerations known in the art related to the specific proteins of this
example.


Example 6 
Thrombopoietin (TPO)

Thrombopoietin stimulates the development of megakaryocyte precursors of platelets.
The amino acid sequence of TPO (SEQ ID NO: 7) is given in Bartley et al. (1994), Foster et
al. (1994) and de Sauvage et al. (1994), each incorporated herein by reference.

The protein is synthesized as a 353 amino acid precursor protein that is cleaved to
yield a mature protein of 332 amino acids. The N-terminal 154 amino acids have homology
with EPO and other members of the GH supergene family. The C-terminal 199 amino acids
do not share homology with any other known proteins. The C-terminal region contains six
N-linked glycosylation sites and multiple O-linked glycosylation sites (Hoffman et al., 1996).
O-linked glycosylation sites also are found in the region proximal to the A helix, in the A-B
loop, and at the C-terminus of Helix C (Hoffman et al., 1996). A truncated TPO protein
containing only residues 1-195 of the mature protein is fully active in vitro (Bartley et al.,
1994).

This example provides cysteine-added variants at any of the amino acids that comprise 
the N-linked glycosylation sites and the O-linked glycosylation sites. This example also 

1 5 provides cysteine-added variants in the region proximal to the A helix, distal to the D helix, in 
the A-B loop, in the B-C loop, and in the C-D loop. 

Preferred sites for introduction of cysteine residues are: S1, P2, A3, P4, P5, A6, T37, 
A43, D45, S47, G49, E50, K52, T53, Q54, E56, E57, T58, A76, A77, R78, G79, Q80, G82, 
T84, S87, S88, G109, T110, Q111, P113, P114, Q115, G116, R117, T118, T119, A120, 
H121, K122, G146, G147, S148, T149, A155, T158, T159, A160, S163, T165, S166, T170, 
N176, R177, T178, S179, G180, E183, T184, N185, F186, T187, A188, S189, A190, T192, 
T193, G194, S195, N213, Q214, T215, S216, S218, N234, G235, T236, S244, T247, S254, 
S255, T257, S258, T260, S262, S272, S274, T276, T280, T291, T294, S307, T310, T312, 
T314, S315, N319, T320, S321, T323, S325, Q326, N327, L328, S329, Q330, E331 and 
G332. Variants in which cysteine residues are introduced proximal to the first amino acid of 
the mature protein, i.e., proximal to S1, or distal to the final amino acid in the mature protein, 
i.e., distal to G332, are provided. The cysteine-added variants are provided in the context of 
the natural human protein or a variant protein that is truncated between amino acids 147 and 
the C-terminus of the natural protein, G332. Variants in which cysteine residues are added 
distal to the final amino acid of a TPO protein that is truncated between amino acids 147 and 
332 also are provided. 

The novel TPO-derived molecules of this example can be formulated and tested for 
activity essentially as set forth in Examples 1 and 2, substituting, however, the appropriate 
assays and other considerations known in the art related to the specific proteins of this 
example. 
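
As an illustration only, the relationship between the preferred TPO sites and a truncated TPO protein can be sketched in a few lines of Python. The sketch assumes that a site is available in a truncated protein whenever its position does not exceed the last retained residue. The site list is a subset of the sites named in this example, and the function names are hypothetical helpers, not part of the disclosure.

```python
import re

# Subset of the preferred TPO sites named in this example
# (one-letter wild-type residue followed by its position in the mature protein).
PREFERRED_SITES = ["S1", "P2", "T37", "G109", "T110", "G146", "G147",
                   "S148", "A155", "T193", "S195", "N213", "G332"]

SITE_PATTERN = re.compile(r"^([A-Z])(\d+)$")


def parse_site(site):
    """Split a site label such as 'T110' into ('T', 110)."""
    match = SITE_PATTERN.match(site)
    if match is None:
        raise ValueError(f"unrecognized site label: {site!r}")
    return match.group(1), int(match.group(2))


def sites_in_truncated_protein(sites, last_residue):
    """Return the sites that still exist when the protein ends at last_residue."""
    return [s for s in sites if parse_site(s)[1] <= last_residue]


if __name__ == "__main__":
    # Residues 1-195 correspond to the truncated protein cited from Bartley et al.
    print(sites_in_truncated_protein(PREFERRED_SITES, 195))
```

With a truncation after residue 195, the sketch keeps the sites up to S195 and drops N213 and G332. A cysteine added distal to the final amino acid of the truncated protein would be handled separately, since it is not a substitution at a listed site.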

Example 7 

Granulocyte-Macrophage Colony Stimulating Factor (GM-CSF) 

GM-CSF stimulates the proliferation and differentiation of various hematopoietic 
cells, including neutrophil, monocyte, eosinophil, erythroid, and megakaryocyte cell lineages. 
The amino acid sequence of human GM-CSF (SEQ ID NO: 8) is given in Cantrell et al. 
(1985) and Lee et al. (1985), both incorporated herein by reference. 

GM-CSF is produced as a 144 amino acid preprotein that is cleaved to yield a mature 
127 amino acid protein. The mature protein has two sites for N-linked glycosylation. One site 
is located at the C-terminal end of Helix A; the second site is in the A-B loop. 
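
Because residue positions in these examples are counted from the first amino acid of the mature protein, it can help to convert between precursor and mature numbering. The following Python sketch does this under one stated assumption: that the residues removed on cleavage form a single contiguous N-terminal leader. That assumption is not stated for every protein here, and the function names are hypothetical. The lengths in the table are the precursor and mature lengths given in this section.

```python
# (precursor length, mature length) in amino acids, as stated in this section.
LENGTHS = {
    "TPO": (353, 332),
    "GM-CSF": (144, 127),
    "IL-2": (153, 133),
    "IL-4": (153, 129),
    "IL-6": (212, 184),
    "IL-7": (177, 152),
    "IL-9": (144, 126),
    "IL-10": (178, 160),
}


def leader_length(protein):
    """Number of residues removed on cleavage (assumed to be all N-terminal)."""
    precursor, mature = LENGTHS[protein]
    return precursor - mature


def precursor_to_mature(protein, precursor_position):
    """Convert a 1-based precursor position to mature numbering.

    Returns None when the position lies in the assumed leader or past the end.
    """
    mature_length = LENGTHS[protein][1]
    mature_position = precursor_position - leader_length(protein)
    if mature_position < 1 or mature_position > mature_length:
        return None
    return mature_position


if __name__ == "__main__":
    for name in LENGTHS:
        print(name, leader_length(name))
```

For GM-CSF the difference is 144 - 127 = 17 residues, so under this assumption position 44 of the preprotein corresponds to position 27 of the mature protein.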

This example provides cysteine-added variants at any of the amino acids that comprise 
the N-linked glycosylation sites, i.e., N27C, L28C, S29C, N37C, E38C and T39C. This 
example also provides cysteine-added variants in the region proximal to the A helix, distal to 
the D helix, in the A-B loop, in the B-C loop, and in the C-D loop. Preferred sites for 
introduction of cysteine substitutions in these regions are: A1, P2, A3, R4, S5, P6, S7, P8, S9, 
T10, Q11, R30, D31, T32, A33, A34, E35, E41, S44, E45, D48, Q50, E51, T53, Q64, G65, 
R67, G68, S69, L70, T71, K72, K74, G75, T91, E93, T94, S95, A97, T98, T102, I117, D120, 
E123, V125, Q126 and E127. Variants in which cysteine residues are introduced proximal to 
the first amino acid of the mature protein, i.e., proximal to A1, or distal to the final amino 
acid in the mature protein, i.e., distal to E127, are provided. 
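
The six glycosylation-site substitutions named in this example fall into two runs of three consecutive positions. This is what one would expect if each N-linked site follows the commonly described Asn-X-Ser/Thr consensus, where X is any residue other than proline. That consensus is general background rather than something stated in this example, so the Python sketch below should be read as a consistency check under that assumption. The function names are hypothetical.

```python
import re

# Cysteine substitutions at the two N-linked sites, as listed in this example.
GLYCOSYLATION_SUBSTITUTIONS = ["N27C", "L28C", "S29C", "N37C", "E38C", "T39C"]

SUBSTITUTION = re.compile(r"^([A-Z])(\d+)C$")


def parse_substitution(label):
    """'N27C' -> ('N', 27): wild-type residue and position replaced by cysteine."""
    match = SUBSTITUTION.match(label)
    if match is None:
        raise ValueError(f"not a cysteine substitution: {label!r}")
    return match.group(1), int(match.group(2))


def group_sequons(labels):
    """Group substitutions three at a time and test each group against N-X-S/T.

    Returns a list of (first position, residues, matches_consensus) tuples.
    """
    parsed = sorted((parse_substitution(l) for l in labels), key=lambda p: p[1])
    results = []
    for i in range(0, len(parsed), 3):
        triplet = parsed[i:i + 3]
        residues = "".join(res for res, _ in triplet)
        positions = [pos for _, pos in triplet]
        consecutive = (len(triplet) == 3 and
                       positions == list(range(positions[0], positions[0] + 3)))
        matches = (consecutive and residues[0] == "N"
                   and residues[1] != "P" and residues[2] in "ST")
        results.append((positions[0], residues, matches))
    return results


if __name__ == "__main__":
    for start, residues, matches in group_sequons(GLYCOSYLATION_SUBSTITUTIONS):
        print(start, residues, matches)
```

Run on the list above, the sketch reports the triplets N-L-S starting at position 27 and N-E-T starting at position 37, both of which fit the assumed consensus.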

The novel GM-CSF-derived molecules of this example can be formulated and tested 
for activity essentially as set forth in Examples 1 and 2, substituting, however, the appropriate 
assays and other considerations known in the art related to the specific proteins of this 
example. 

Example 8 
IL-2 

IL-2 is a T cell growth factor that is synthesized by activated T cells. The protein 
stimulates clonal expansion of activated T cells. Human IL-2 is synthesized as a 153 amino 
acid precursor that is cleaved to yield a 133 amino acid mature protein (Tadatsugu et al., 
1983; Devos et al., 1983; SEQ ID NO: 9). 

The amino acid sequence of IL-2 is set forth in Tadatsugu et al. (1983) and Devos et al. 
(1983). The mature protein contains three cysteine residues, two of which form a disulfide 
bond. Cysteine-125 of the mature protein is not involved in a disulfide bond. Replacement 
of cysteine-125 with serine yields an IL-2 mutein with full biological activity (Wang et al., 
1984). The protein is O-glycosylated at threonine-3 of the mature protein chain. 

This example provides cysteine-added variants in the last four positions of the D 
helix, in the region distal to the D helix, in the A-B loop, in the B-C loop, and in the C-D 
loop. These variants are provided in the context of the natural protein sequence or a variant 
protein in which the naturally occurring "free" cysteine residue (cysteine-125) has been 
changed to another amino acid, preferably serine or alanine. Variants in which cysteine 
residues are introduced proximal to the first amino acid, i.e., A1, or distal to the final amino 
acid, i.e., T133, of the mature protein, also are provided. 
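
The two kinds of change described in this example, replacing an existing free cysteine and adding a cysteine at a terminus, can be sketched as simple string operations. The Python below is a minimal illustration, assuming one-letter sequences and substitution labels of the form wild-type residue, position, new residue. The seven-residue peptide is a hypothetical toy sequence, not IL-2, and the function names are invented for the sketch.

```python
def substitute(sequence, label):
    """Apply a point substitution such as 'C4S' to a one-letter sequence.

    The label gives the wild-type residue, its 1-based position and the
    replacement residue. The wild-type residue is checked before replacing.
    """
    wild_type, position, replacement = label[0], int(label[1:-1]), label[-1]
    if not 1 <= position <= len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    if sequence[position - 1] != wild_type:
        raise ValueError(
            f"expected {wild_type} at position {position}, "
            f"found {sequence[position - 1]}")
    return sequence[:position - 1] + replacement + sequence[position:]


def add_terminal_cysteine(sequence, end):
    """Add a cysteine before the first residue ('N') or after the last ('C')."""
    if end == "N":
        return "C" + sequence
    if end == "C":
        return sequence + "C"
    raise ValueError("end must be 'N' or 'C'")


# Hypothetical toy peptide with one cysteine at position 4 (NOT the IL-2 sequence).
TOY_PEPTIDE = "APTCSLT"

if __name__ == "__main__":
    without_free_cys = substitute(TOY_PEPTIDE, "C4S")
    print(without_free_cys)
    print(add_terminal_cysteine(without_free_cys, "C"))
```

Checking the wild-type residue before substituting is a deliberate choice: it catches numbering mistakes, such as using precursor numbering where mature numbering is intended.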

The novel IL-2-derived molecules of this example can be formulated and tested for 
activity essentially as set forth in Examples 1 and 2, substituting, however, the appropriate 
assays and other considerations known in the art related to the specific proteins of this 
example. 

Example 9 
IL-3 

IL-3 is produced by activated T cells and stimulates the proliferation and 
differentiation of pluripotent hematopoietic stem cells. The amino acid sequence of human 
IL-3 (SEQ ID NO: 10) is given in Yang et al. (1986), Dorssers et al. (1987) and Otsuka et al. 
(1988), all incorporated herein by reference. The protein contains two cysteine residues and 
two N-linked glycosylation sites. Two alleles have been described, resulting in isoforms 
having serine or proline at amino acid position 8 of the mature protein. 

This example provides cysteine-added variants at any of the amino acids that comprise 
the N-linked glycosylation sites. This example also provides cysteine-added variants in the 
region proximal to the A helix, distal to the D helix, in the A-B loop, in the B-C loop, and in 
the C-D loop. Variants in which cysteine residues are introduced proximal to the first amino 
acid or distal to the final amino acid in the mature protein also are provided. 

The novel IL-3-derived molecules of this example can be formulated and tested for 
activity essentially as set forth in Examples 1 and 2, substituting, however, the appropriate 
assays and other considerations known in the art related to the specific proteins of this 
example. 

Example 10 
IL-4 

IL-4 is a pleiotropic cytokine that stimulates the proliferation and differentiation of 
monocytes and T and B cells. IL-4 has been implicated in the process that leads to B cells 
secreting IgE, which is believed to play a role in asthma and atopy. The bioactivity of IL-4 is 
species specific. IL-4 is synthesized as a 153 amino acid precursor protein that is cleaved to 
yield a mature protein of 129 amino acids. The amino acid sequence of human IL-4 (SEQ ID 
NO: 11) is given in Yokota et al. (1986), which is incorporated herein by reference. The 
protein contains six cysteine residues and two N-linked glycosylation sites. The glycosylation 
sites are located in the A-B and C-D loops. 

This example provides cysteine-added variants at any of the amino acids that comprise 
the N-linked glycosylation sites. This example also provides cysteine-added variants in the 
region proximal to the A helix, distal to the D helix, in the A-B loop, in the B-C loop, and in 
the C-D loop. Variants in which cysteine residues are introduced proximal to the first amino 
acid, H1, or distal to the final amino acid, S129, of the mature protein are provided. 

The novel IL-4-derived molecules of this example can be formulated and tested for 
activity essentially as set forth in Examples 1 and 2, substituting, however, the appropriate 
assays and other considerations known in the art related to the specific proteins of this 
example. 

Example 11 
IL-5 

IL-5 is a differentiation and activation factor for eosinophils. The amino acid 
sequence of human IL-5 (SEQ ID NO: 12) is given in Yokota et al. (1987), which is 
incorporated herein by reference. The mature protein contains 115 amino acids and exists in 
solution as a disulfide-linked homodimer. The protein contains both O-linked and N-linked 
glycosylation sites. 

This example provides cysteine-added variants in the region proximal to the A helix, 
distal to the D helix, in the A-B loop, in the B-C loop, and in the C-D loop. Variants in which 
cysteine residues are added proximal to the first amino acid or distal to the final amino acid in 
the mature protein are provided. 

The novel IL-5-derived molecules of this example can be formulated and tested for 
activity essentially as set forth in Examples 1 and 2, substituting, however, the appropriate 
assays and other considerations known in the art related to the specific proteins of this 
example. 

Example 12 
IL-6 

IL-6 stimulates the proliferation and differentiation of many cell types. The amino 
acid sequence of human IL-6 (SEQ ID NO: 13) is given in Hirano et al. (1986), which is 
incorporated herein by reference. IL-6 is synthesized as a 212 amino acid preprotein that is 
cleaved to generate a 184 amino acid mature protein. The mature protein contains two sites 
for N-linked glycosylation and one site for O-glycosylation, at T137, T138, T142 or T143. 

This example provides cysteine-added variants at any of the amino acids that comprise 
the N-linked glycosylation sites and at the O-linked glycosylation site. Also provided are 
cysteine-added variants in the region proximal to the A helix, distal to the D helix, in the A-B 
loop, in the B-C loop, and in the C-D loop. Variants in which cysteine residues are added 
proximal to the first amino acid or distal to the final amino acid of the mature protein also are 
provided. 

The novel IL-6-derived molecules of this example can be formulated and tested for 
activity essentially as set forth in Examples 1 and 2, substituting, however, the appropriate 
assays and other considerations known in the art related to the specific proteins of this 
example. 

Example 13 
IL-7 

IL-7 stimulates proliferation of immature B cells and acts on mature T cells. The 
amino acid sequence of human IL-7 (SEQ ID NO: 14) is given in Goodwin et al. (1989) 
which is incorporated herein by reference. The protein is synthesized as a 177 amino acid 
preprotein that is cleaved to yield a 152 amino acid mature protein that contains three sites for 
N-linked glycosylation. 

This example provides cysteine-added variants at any of the amino acids that comprise 
the N-linked glycosylation sites. This example also provides cysteine-added variants in the 
region proximal to the A helix, distal to the D helix, in the A-B loop, in the B-C loop, and in 
the C-D loop. 

The novel IL-7-derived molecules of this example can be formulated and tested for 
activity essentially as set forth in Examples 1 and 2, substituting, however, the appropriate 
assays and other considerations known in the art related to the specific proteins of this 
example. Variants in which cysteine residues are added proximal to the first amino acid or 
distal to the final amino acid of the mature protein also are provided. 



Example 14 
IL-9 

IL-9 is a pleiotropic cytokine that acts on many cell types in the lymphoid, myeloid 
and mast cell lineages. IL-9 stimulates the proliferation of activated T cells and cytotoxic T 
lymphocytes, stimulates proliferation of mast cell precursors and synergizes with 
erythropoietin in stimulating immature red blood cell precursors. The amino acid sequence of 
human IL-9 (SEQ ID NO: 15) is given in Yang et al. (1989) which is incorporated herein by 
reference. IL-9 is synthesized as a precursor protein of 144 amino acids that is cleaved to 
yield a mature protein of 126 amino acids. The protein contains four potential N-linked 
glycosylation sites. 

This example provides cysteine-added variants at any of the three amino acids that comprise 
the N-linked glycosylation sites. This example also provides cysteine-added variants in the 
region proximal to the A helix, distal to the D helix, in the A-B loop, in the B-C loop, and in 
the C-D loop. Variants in which cysteine residues are added proximal to the first amino acid 
or distal to the final amino acid of the mature protein also are provided. 

The novel IL-9-derived molecules of this example can be formulated and tested for 
activity essentially as set forth in Examples 1 and 2, substituting, however, the appropriate 
assays and other considerations known in the art related to the specific proteins of this 
example. 

Example 15 
IL-10 

The amino acid sequence of human IL-10 (SEQ ID NO: 16) is given in Vieira et al. 
(1991), which is incorporated herein by reference. IL-10 is synthesized as a 178 amino acid 
precursor protein that is cleaved to yield a mature protein of 160 amino acids. IL-10 can 
function to activate or suppress the immune system. The protein shares structural homology 
with the interferons, i.e., it contains five amphipathic helices. The protein contains one N- 
linked glycosylation site. 

This example provides cysteine-added variants at any of the three amino acids 
comprising the N-linked glycosylation site. This example also provides cysteine-added 
variants in the region proximal to the A helix, distal to the E helix, in the A-B loop, in the B- 
C loop, in the C-D loop and in the D-E loop. Variants in which cysteine residues are added 
proximal to the first amino acid or distal to the final amino acid of the mature protein also are 
provided. 

The novel IL-10-derived molecules of this example can be formulated and tested for 
activity essentially as set forth in Examples 1 and 2, substituting, however, the appropriate 
assays and other considerations known in the art related to the specific proteins of this 
example. 

Example 16 
IL-11 

IL-11 is a pleiotropic cytokine that stimulates hematopoiesis, lymphopoiesis and acute 
phase responses. IL-11 shares many biological effects with IL-6. The amino acid sequence 
of human IL-11 (SEQ ID NO: 17) is given in Kawashima et al. (1991) and Paul et al. (1990), 
both incorporated herein by reference. IL-11 is synthesized as a precursor protein of 199 
amino acids that is cleaved to yield a mature protein of 178 amino acids. There are no N- 
linked glycosylation sites in the protein. 

This example provides cysteine-added variants in the region proximal to the A helix, 
distal to the D helix, in the A-B loop, in the B-C loop, and in the C-D loop. Variants in which 
cysteine residues are added proximal to the first amino acid or distal to the final amino acid of 
the mature protein also are provided. 

The novel IL-11-derived molecules of this example can be formulated and tested for 
activity essentially as set forth in Examples 1 and 2, substituting, however, the appropriate 
assays and other considerations known in the art related to the specific proteins of this 
example. 

Example 17 
IL-12 p35 

IL-12 stimulates proliferation and differentiation of NK cells and cytotoxic T 
lymphocytes. IL-12 exists as a heterodimer of a p35 subunit and a p40 subunit. The p35 
subunit is a member of the GH supergene family. The amino acid sequence of the p35 
subunit (SEQ ID NO: 18) is given in Gubler et al. (1991) and Wolf et al. (1991), both 
incorporated herein by reference. p35 is synthesized as a precursor protein of 197 amino 
acids and is cleaved to yield a mature protein of 175 amino acids. The protein contains 7 
cysteine residues and three potential N-linked glycosylation sites. 

This example provides cysteine-added variants at any of the three amino acids that 
comprise the three N-linked glycosylation sites. This example also provides cysteine-added 
variants in the region proximal to the A helix, distal to the D helix, in the A-B loop, in the B- 
C loop, and in the C-D loop. These variants are provided in the context of the natural protein 
sequence or a variant protein in which the naturally occurring "free" cysteine residue has been 
changed to another amino acid, preferably serine or alanine. Variants in which cysteine 
residues are added proximal to the first amino acid or distal to the final amino acid of the 
mature protein also are provided. 

The novel IL-12 p35-derived molecules of this example can be formulated and tested 
for activity essentially as set forth in Examples 1 and 2, substituting, however, the appropriate 
assays and other considerations known in the art related to the specific proteins of this 
example. 

Example 18 
IL-13 

IL-13 shares many biological properties with IL-4. The amino acid sequence of 
human IL-13 (SEQ ID NO: 19) is given in McKenzie et al. (1993) and Minty et al. (1993), 
both incorporated herein by reference. The protein is synthesized as a 132 amino acid 
precursor protein that is cleaved to yield a mature protein of 112 amino acids. The mature 
protein contains 5 cysteine residues and multiple N-linked glycosylation sites. A variant in 
which glutamine at position 78 is deleted due to alternative mRNA splicing has been 
described (McKenzie et al., 1993). 

This example provides cysteine-added variants at any of the three amino acids 
comprising the N-linked glycosylation sites. This example also provides cysteine-added 
variants in the region proximal to the A helix, distal to the D helix, in the A-B loop, in the B- 
C loop, and in the C-D loop. These variants are provided in the context of the natural protein 
sequence or a variant sequence in which the pre-existing "free" cysteine has been changed to 
another amino acid, preferably to alanine or serine. Variants in which cysteine residues are 
added proximal to the first amino acid or distal to the final amino acid of the mature protein 
also are provided. 

The novel IL-13-derived molecules of this example can be formulated and tested for 
activity essentially as set forth in Examples 1 and 2, substituting, however, the appropriate 
assays and other considerations known in the art related to the specific proteins of this 
example. 



Example 19 
IL-15 

IL-15 stimulates the proliferation and differentiation of T cells, NK cells, LAK cells 
and Tumor Infiltrating Lymphocytes. IL-15 may be useful for treating cancer and viral 
infections. The sequence of IL-15 (SEQ ID NO: 20) is given in Anderson et al. (1995), which 
is incorporated herein by reference. IL-15 contains two N-linked glycosylation sites, which 
are located in the C-D loop and C-terminal end of the D helix. IL-15 is synthesized as a 162 amino 
acid preprotein that is cleaved to generate a mature 114 amino acid protein. 

This example provides cysteine-added variants at any of the three amino acids 
comprising the N-linked glycosylation sites in the C-D loop or C-terminal end of the D helix. 
This example also provides cysteine-added variants proximal to helix A, in the A-B loop, the 
B-C loop, the C-D loop or distal to helix D. Variants in which cysteine residues are added 
proximal to the first amino acid or distal to the final amino acid of the mature protein also are 
provided. 

The novel IL-15-derived molecules of this example can be formulated and tested for 
activity essentially as set forth in Examples 1 and 2, substituting, however, the appropriate 
assays and other considerations known in the art related to the specific proteins of this 
example. 



Example 20 

Macrophage Colony Stimulating Factor (M-CSF): 

M-CSF regulates the growth, differentiation and function of monocytes. The protein 
is a disulfide-linked homodimer. Multiple molecular weight species of M-CSF, which arise 
from differential mRNA splicing, have been described. The amino acid sequences of human 
M-CSF and its various processed forms are given in Kawasaki et al. (1985), Wong et al. 
(1987) and Cerretti et al. (1988), which are incorporated herein by reference. Cysteine-added 
variants can be produced following the general teachings of this application and in 
accordance with the examples set forth herein. 

The novel M-CSF-derived molecules of this example can be formulated and tested for 
activity essentially as set forth in Examples 1 and 2, substituting, however, the appropriate 
assays and other considerations known in the art related to the specific proteins of this 
example. 



Example 21 
Oncostatin M: 

Oncostatin M is a multifunctional cytokine that affects the growth and differentiation 
of many cell types. The amino acid sequence of human oncostatin M (SEQ ID NO: 21) is 
given in Malik et al. (1989) which is incorporated herein by reference. Oncostatin M is 
produced by activated monocytes and T lymphocytes. Oncostatin M is synthesized as a 252 
amino acid preprotein that is cleaved sequentially to yield a 227 amino acid protein and then a 
196 amino acid protein (Linsley et al., 1990). The mature protein contains O-linked 
glycosylation sites and two N-linked glycosylation sites. The protein is O-glycosylated at 
T160, T162 and S165. The mature protein contains five cysteine residues. 

This example provides cysteine-added variants at any of the three amino acids 
comprising the N-linked glycosylation sites or at the amino acids that comprise the O-linked 
glycosylation sites. This example also provides cysteine-added variants proximal to helix A, 
in the A-B loop, in the B-C loop, in the C-D loop or distal to helix D. These variants are 
provided in the context of the natural protein sequence or a variant sequence in which the pre- 
existing "free" cysteine has been changed to another amino acid, preferably to alanine or 
serine. Variants in which cysteine residues are added proximal to the first amino acid or distal 
to the final amino acid of the mature protein also are provided. 

The novel Oncostatin M-derived molecules of this example can be formulated and 
tested for activity essentially as set forth in Examples 1 and 2, substituting, however, the 
appropriate assays and other considerations known in the art related to the specific proteins of 
this example. 

Example 22 
Ciliary Neurotrophic Factor (CNTF): 

The amino acid sequence of human CNTF (SEQ ID NO: 22) is given in Lam et al. 
(1991) which is incorporated herein by reference. CNTF is a 200 amino acid protein that 
contains no glycosylation sites or signal sequence for secretion. The protein contains one 
cysteine residue. CNTF functions as a survival factor for nerve cells. 

This example provides cysteine-added variants proximal to helix A, in the A-B loop, 
the B-C loop, the C-D loop or distal to helix D. These variants are provided in the context of 
the natural protein sequence or a variant sequence in which the pre-existing "free" cysteine 
has been changed to another amino acid, preferably to alanine or serine. Variants in which 
cysteine residues are added proximal to the first amino acid or distal to the final amino acid of 
the mature protein also are provided. 

The novel CNTF-derived molecules of this example can be formulated and tested for 
activity essentially as set forth in Examples 1 and 2, substituting, however, the appropriate 
assays and other considerations known in the art related to the specific proteins of this 
example. 

Example 23 
Leukemia Inhibitory Factor (LIF) 

The amino acid sequence of LIF (SEQ ID NO: 23) is given in Moreau et al. (1988) 
and Gough et al. (1988) both incorporated herein by reference. The human gene encodes a 
202 amino acid precursor that is cleaved to yield a mature protein of 180 amino acids. The 
protein contains six cysteine residues, all of which participate in disulfide bonds. The protein 
contains multiple O- and N-linked glycosylation sites. The crystal structure of the protein 
was determined by Robinson et al. (1994). The protein affects the growth and differentiation 
of many cell types. 

This example provides cysteine-added variants at any of the three amino acids 
comprising the N-linked glycosylation sites or the O-linked glycosylation site. Also provided 
are cysteine-added variants proximal to helix A, in the A-B loop, the B-C loop, the C-D loop 
or distal to helix D. Variants in which cysteine residues are added proximal to the first amino 
acid or distal to the final amino acid of the mature protein also are provided. 

The novel LIF-derived molecules of this example can be formulated and tested for 
activity essentially as set forth in Examples 1 and 2, substituting, however, the appropriate 
assays and other considerations known in the art related to the specific proteins of this 
example. 

All of the documents cited herein are incorporated herein by reference. 

The protein analogues disclosed herein can be used for the known therapeutic uses of 
the native proteins in essentially the same forms and doses all well known in the art. 

While the exemplary preferred embodiments of the present invention are described 
herein with particularity, those having ordinary skill in the art will recognize changes, 
modifications, additions, and applications other than those specifically described herein, and 
may adapt the preferred embodiments and methods without departing from the spirit of this 
invention. 

What is claimed is: 

1. A cysteine variant of a member of the GH supergene family comprising a cysteine residue 
substituted for an amino acid selected from the group consisting of an amino acid in the 
loop regions, an amino acid near the ends of the alpha helices, an amino acid proximal to 
the first amphipathic helix, and an amino acid distal to the final amphipathic helix or 
wherein the cysteine residue is added at the N-terminus or C-terminus of the proteins. 

2. A cysteine variant of a member of the GH supergene family comprising a cysteine residue 
introduced between two amino acids in the loop regions, the ends of the alpha helices, 
proximal to the first amphipathic helix, or distal to the final amphipathic helix. 

3. A cysteine variant according to claim 1 wherein the amino acid substituted for is part of 
an N- or O-linked glycosylation site. 

4. A cysteine variant according to claim 2 wherein the cysteine residue is introduced 
between two amino acids in an N-linked glycosylation site or adjacent to an amino acid in 
an N-linked or O-linked glycosylation site. 

5. A cysteine variant according to claims 1-4 wherein the member of the GH supergene 
family is selected from the group consisting of growth hormone, prolactin, placental 
lactogen, erythropoietin, thrombopoietin, interleukin-2, interleukin-3, interleukin-4, 
interleukin-5, interleukin-6, interleukin-7, interleukin-9, interleukin-10, interleukin-11, 
interleukin-12 (p35 subunit), interleukin-13, interleukin-15, oncostatin M, ciliary 
neurotrophic factor, leukemia inhibitory factor, alpha interferon, beta interferon, gamma 
interferon, omega interferon, tau interferon, granulocyte-colony stimulating factor, 
granulocyte-macrophage colony stimulating factor, cardiotrophin-1 and macrophage 
colony stimulating factor. 

6. A cysteine variant according to claim 1 wherein the amino acid substituted for is in the A- 
B loop, the B-C loop, the C-D loop or D-E loop of interferon/interferon-10-like members 
of the GH supergene family. 

7. A cysteine variant according to claim 2 wherein the loop region is the A-B loop, the B-C 
loop, the C-D loop or D-E loop of interferon/interferon-10-like members of the GH 
supergene family. 

8. A cysteine variant according to claims 1-7 wherein the added cysteine is PEGylated. 

9. A cysteine variant according to claim 1 wherein the member of the GH supergene family 
is growth hormone. 

10. The cysteine variant according to claim 9 wherein the substituted for amino acid is 
selected from the group consisting of amino acids located at the N-terminal end of the A- 
B loop, the B-C loop, the C-D loop, the first three or last three amino acids in the A, B, C 
and D helices and the amino acids proximal to helix A and distal to helix D. 

11. The cysteine variant according to claim 10 wherein the substituted for amino acid is 
selected from the group consisting of F1, T3, P5, E33, A34, K38, E39, Q40, S43, Q46, 
N47, P48, Q49, T50, S51, S55, T60, A98, N99, S100, G104, A105, S106, E129, D130, 
G131, S132, P133, T135, G136, Q137, K140, Q141, T142, S144, K145, D147, T148, 
N149, S150, H151, N152, D153, S184, E186, G187, S188, and G190. 

12. A cysteine variant according to claim 1 wherein the member of the GH supergene family 
is erythropoietin. 

13. The cysteine variant according to claim 12 wherein the substituted for amino acid is 
selected from the group consisting of amino acids located in the A-B loop, the B-C loop, 
the C-D loop, the amino acids proximal to helix A and distal to helix D and the N- or C- 
terminus. 

14. A cysteine variant according to claim 13 wherein the substituted for amino acid is selected 
from the group consisting of serine-126, N24, I25, T26, N38, I39, T40, N83, S84, A1, P2, 
P3, R4, D8, S9, T27, G28, A30, E31, H32, S34, N36, D43, T44, K45, N47, A50, K52, 
E55, G57, Q58, G77, Q78, A79, Q86, W88, E89, T107, R110, A111, G113, A114, Q115, 
K116, E117, A118, S120, P121, P122, D123, A124, A125, A127, A128, T132, K154, 
T157, G158, E159, A160, T163, G164, D165, R166 and S85. 

15. A cysteine variant according to claim 2 wherein the member of the GH supergene family 
is growth hormone. 

16. A cysteine variant according to claim 15 wherein the cysteine is introduced into the region 
selected from the group consisting of amino acids located at the N-terminal end of the A- 
B loop, the B-C loop, the C-D loop, the first three or last three amino acids in the A, B, C 
and D helices and the amino acids proximal to helix A and distal to helix D. 

17. A cysteine variant according to claim 2 wherein the member of the GH supergene family 
is erythropoietin. 

18. A cysteine variant according to claim 17 wherein the cysteine is introduced into the region 
selected from the group consisting of the N-terminal end of the A-B loop, the B-C loop, 
the C-D loop, the area of the first three or last three amino acids in the A, B, C and D 
helices and proximal to helix A and distal to helix D. 

19. A cysteine variant according to claim 2 wherein the member of the GH supergene family 
is alpha interferon. 

20. A cysteine variant according to claim 19 wherein the cysteine is introduced into the region 
selected from the group consisting of the N-terminal end of the A-B loop, the B-C loop, 
the C-D loop, the area of the first three or last three amino acids in the A, B, C and D 
helices and proximal to helix A and distal to helix D. 

21. A cysteine variant according to claim 2 wherein the member of the GH supergene family 
is granulocyte-colony stimulating factor. 

22. A cysteine variant according to claim 21 wherein the cysteine is introduced into the region 
selected from the group consisting of the N-terminal end of the A-B loop, the B-C loop, 
the C-D loop, the area of the first three or last three amino acids in the A, B, C and D 
helices and proximal to helix A and distal to helix D. 

23. A cysteine variant according to claims 8-22 wherein the added cysteine is PEGylated. 

SEQUENCE LISTING 

<110> Cox III, George N 

Bolder Biotechnology, Inc. 

<120> Derivatives of Growth Hormone and Related Proteins 
<130> BB0011 

<140> unknown 
<141> 1998-07-14 

<150> 60/052,516 
<151> 1997-07-14 

<160> 41 

<170> PatentIn Ver. 2.0 

<210> 1 
<211> 191 
<212> PRT 

<213> Homo sapiens 
<400> 1 

Phe Pro Thr Ile Pro Leu Ser Arg Leu Phe Asp Asn Ala Met Leu Arg 
1 5 10 15 

Ala His Arg Leu His Gln Leu Ala Phe Asp Thr Tyr Gln Glu Phe Glu 
20 25 30 

Glu Ala Tyr Ile Pro Lys Glu Gln Lys Tyr Ser Phe Leu Gln Asn Pro 
35 40 45 

Gln Thr Ser Leu Cys Phe Ser Glu Ser Ile Pro Thr Pro Ser Asn Arg 
50 55 60 

Glu Glu Thr Gln Gln Lys Ser Asn Leu Glu Leu Leu Arg Ile Ser Leu 
65 70 75 80 

Leu Leu Ile Gln Ser Trp Leu Glu Pro Val Gln Phe Leu Arg Ser Val 
85 90 95 

Phe Ala Asn Ser Leu Val Tyr Gly Ala Ser Asp Ser Asn Val Tyr Asp 
100 105 110 

Leu Leu Lys Asp Leu Glu Glu Gly Ile Gln Thr Leu Met Gly Arg Leu 
115 120 125 

Glu Asp Gly Ser Pro Arg Thr Gly Gln Ile Phe Lys Gln Thr Tyr Ser 
130 135 140 

Lys Phe Asp Thr Asn Ser His Asn Asp Asp Ala Leu Leu Lys Asn Tyr 
145 150 155 160 

Gly Leu Leu Tyr Cys Phe Arg Lys Asp Met Asp Lys Val Glu Thr Phe 
165 170 175 

Leu Arg Ile Val Gln Cys Arg Ser Val Glu Gly Ser Cys Gly Phe 
180 185 190 
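
The following is a minimal sketch, assuming Python and hypothetical helper names that do not appear in the application, of how the residues listed for SEQ ID NO:1 can be counted against the length given under <211> and how the positions of the cysteine residues already present in that sequence can be listed. The residue string is transcribed from the listing above with the three-letter codes restored.

```python
# Illustrative sketch only: the helper names are hypothetical and are not
# part of the application. The residues are those listed for SEQ ID NO:1.
SEQ_ID_1 = """
Phe Pro Thr Ile Pro Leu Ser Arg Leu Phe Asp Asn Ala Met Leu Arg
Ala His Arg Leu His Gln Leu Ala Phe Asp Thr Tyr Gln Glu Phe Glu
Glu Ala Tyr Ile Pro Lys Glu Gln Lys Tyr Ser Phe Leu Gln Asn Pro
Gln Thr Ser Leu Cys Phe Ser Glu Ser Ile Pro Thr Pro Ser Asn Arg
Glu Glu Thr Gln Gln Lys Ser Asn Leu Glu Leu Leu Arg Ile Ser Leu
Leu Leu Ile Gln Ser Trp Leu Glu Pro Val Gln Phe Leu Arg Ser Val
Phe Ala Asn Ser Leu Val Tyr Gly Ala Ser Asp Ser Asn Val Tyr Asp
Leu Leu Lys Asp Leu Glu Glu Gly Ile Gln Thr Leu Met Gly Arg Leu
Glu Asp Gly Ser Pro Arg Thr Gly Gln Ile Phe Lys Gln Thr Tyr Ser
Lys Phe Asp Thr Asn Ser His Asn Asp Asp Ala Leu Leu Lys Asn Tyr
Gly Leu Leu Tyr Cys Phe Arg Lys Asp Met Asp Lys Val Glu Thr Phe
Leu Arg Ile Val Gln Cys Arg Ser Val Glu Gly Ser Cys Gly Phe
"""


def parse_three_letter(text):
    """Split a whitespace-separated three-letter listing into residues."""
    return text.split()


def cysteine_positions(residues):
    """Return the 1-based positions of every Cys residue."""
    return [i for i, res in enumerate(residues, start=1) if res == "Cys"]


residues = parse_three_letter(SEQ_ID_1)
print(len(residues))                 # 191, matching <211> 191
print(cysteine_positions(residues))  # [53, 165, 182, 189]
```

The same two helpers can be applied to any other protein entry in the listing to compare its residue count with the value stated under <211>.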

<210> 2 
<211> 166 
<212> PRT 

<213> Homo sapiens 
<400> 2 

Ala Pro Pro Arg Leu Ile Cys Asp Ser Arg Val Leu Glu Arg Tyr Leu 
1 5 10 15 

Leu Glu Ala Lys Glu Ala Glu Asn Ile Thr Thr Gly Cys Ala Glu His 
20 25 30 

Cys Ser Leu Asn Glu Asn Ile Thr Val Pro Asp Thr Lys Val Asn Phe 
35 40 45 

Tyr Ala Trp Lys Arg Met Glu Val Gly Gln Gln Ala Val Glu Val Trp 
50 55 60 

Gln Gly Leu Ala Leu Leu Ser Glu Ala Val Leu Arg Gly Gln Ala Leu 
65 70 75 80 

Leu Val Asn Ser Ser Gln Pro Trp Glu Pro Leu Gln Leu His Val Asp 
85 90 95 

Lys Ala Val Ser Gly Leu Arg Ser Leu Thr Thr Leu Leu Arg Ala Leu 
100 105 110 

Gly Ala Gln Lys Glu Ala Ile Ser Pro Pro Asp Ala Ala Ser Ala Ala 
115 120 125 

Pro Leu Arg Thr Ile Thr Ala Asp Thr Phe Arg Lys Leu Phe Arg Val 
130 135 140 

Tyr Ser Asn Phe Leu Arg Gly Lys Leu Lys Leu Tyr Thr Gly Glu Ala 
145 150 155 160 

Cys Arg Thr Gly Asp Arg 
165 



<210> 3 
<211> 165 
<212> PRT 

<213> Homo sapiens 
<400> 3 

Cys Asp Leu Pro Gln Thr His Ser Leu Gly Ser Arg Arg Thr Leu Met 
1 5 10 15 

Leu Leu Ala Gln Met Arg Arg Ile Ser Leu Phe Ser Cys Leu Lys Asp 
20 25 30 

Arg His Asp Phe Gly Phe Pro Gln Glu Glu Phe Gly Asn Gln Phe Gln 
35 40 45 

Lys Ala Glu Thr Ile Pro Val Leu His Glu Met Ile Gln Gln Ile Phe 
50 55 60 

Asn Leu Phe Ser Thr Lys Asp Ser Ser Ala Ala Trp Asp Glu Thr Leu 
65 70 75 80 

Leu Asp Lys Phe Tyr Thr Glu Leu Tyr Gln Gln Leu Asn Asp Leu Glu 
85 90 95 

Ala Cys Val Ile Gln Gly Val Gly Val Thr Glu Thr Pro Leu Met Lys 
100 105 110 

Glu Asp Ser Ile Leu Ala Val Arg Lys Tyr Phe Gln Arg Ile Thr Leu 
115 120 125 

Tyr Leu Lys Glu Lys Lys Tyr Ser Pro Cys Ala Trp Glu Val Val Arg 
130 135 140 

Ala Glu Ile Met Arg Ser Phe Ser Leu Ser Thr Asn Leu Gln Glu Ser 
145 150 155 160 

Leu Arg Ser Lys Glu 
165 



<210> 4 
<211> 166 
<212> PRT 

<213> Homo sapiens 
<400> 4 

Cys Asp Leu Pro Glu Thr His Ser Leu Asp Asn Arg Arg Thr Leu Met 
1 5 10 15 

Leu Leu Ala Gln Met Ser Arg Ile Ser Pro Ser Ser Cys Leu Met Asp 
20 25 30 

Arg His Asp Phe Gly Phe Pro Gln Glu Glu Phe Asp Gly Asn Gln Phe 
35 40 45 

Gln Lys Ala Pro Ala Ile Ser Val Leu His Glu Leu Ile Gln Gln Ile 
50 55 60 

Phe Asn Leu Phe Thr Thr Lys Asp Ser Ser Ala Ala Trp Asp Glu Asp 
65 70 75 80 

Leu Leu Asp Lys Phe Cys Thr Glu Leu Tyr Gln Gln Leu Asn Asp Leu 
85 90 95 

Glu Ala Cys Val Met Gln Glu Glu Arg Val Gly Glu Thr Pro Leu Met 
100 105 110 

Asn Ala Asp Ser Ile Leu Ala Val Lys Lys Tyr Phe Arg Arg Ile Thr 
115 120 125 

Leu Tyr Leu Thr Glu Lys Lys Tyr Ser Pro Cys Ala Trp Glu Val Val 
130 135 140 

Arg Ala Glu Ile Met Arg Ser Leu Ser Leu Ser Thr Asn Leu Gln Glu 
145 150 155 160 

Arg Leu Arg Arg Lys Glu 
165 



<210> 5 
<211> 166 
<212> PRT 

<213> Homo sapiens 
<400> 5 

Met Ser Tyr Asn Leu Leu Gly Phe Leu Gln Arg Ser Ser Asn Phe Gln 
1 5 10 15 

Cys Gln Lys Leu Leu Trp Gln Leu Asn Gly Arg Leu Glu Tyr Cys Leu 
20 25 30 

Lys Asp Arg Met Asn Phe Asp Ile Pro Glu Glu Ile Lys Gln Leu Gln 
35 40 45 

Gln Phe Gln Lys Glu Asp Ala Ala Leu Thr Ile Tyr Glu Met Leu Gln 
50 55 60 

Asn Ile Phe Ala Ile Phe Arg Gln Asp Ser Ser Ser Thr Gly Trp Asn 
65 70 75 80 

Glu Thr Ile Val Glu Asn Leu Leu Ala Asn Val Tyr His Gln Ile Asn 
85 90 95 

His Leu Lys Thr Val Leu Glu Glu Lys Leu Glu Lys Glu Asp Phe Thr 
100 105 110 

Arg Gly Lys Leu Met Ser Ser Leu His Leu Lys Arg Tyr Tyr Gly Arg 
115 120 125 

Ile Leu His Tyr Leu Lys Ala Lys Glu Tyr Ser His Cys Ala Trp Thr 
130 135 140 

Ile Val Arg Val Glu Ile Leu Arg Asn Phe Tyr Phe Ile Asn Arg Leu 
145 150 155 160 

Thr Gly Tyr Leu Arg Asn 
165 



<210> 6 

<211> 174 

<212> PRT 

<213> Homo sapiens 

<400> 6 

Thr Pro Leu Gly Pro Ala Ser Ser Leu Pro Gln Ser Phe Leu Leu Lys 
1 5 10 15 

Cys Leu Glu Gln Val Arg Lys Ile Gln Gly Asp Gly Ala Ala Leu Gln 
20 25 30 

Glu Lys Leu Cys Ala Thr Tyr Lys Leu Cys His Pro Glu Glu Leu Val 
35 40 45 

Leu Leu Gly His Ser Leu Gly Ile Pro Trp Ala Pro Leu Ser Ser Cys 
50 55 60 

Pro Ser Gln Ala Leu Gln Leu Ala Gly Cys Leu Ser Gln Leu His Ser 
65 70 75 80 

Gly Leu Phe Leu Tyr Gln Gly Leu Leu Gln Ala Leu Glu Gly Ile Ser 
85 90 95 

Pro Glu Leu Gly Pro Thr Leu Asp Thr Leu Gln Leu Asp Val Ala Asp 
100 105 110 

Phe Ala Thr Thr Ile Trp Gln Gln Met Glu Glu Leu Gly Met Ala Pro 
115 120 125 

Ala Leu Gln Pro Thr Gln Gly Ala Met Pro Ala Phe Ala Ser Ala Phe 
130 135 140 

Gln Arg Arg Ala Gly Gly Val Leu Val Ala Ser His Leu Gln Ser Phe 
145 150 155 160 

Leu Glu Val Ser Tyr Arg Val Leu Arg His Leu Ala Gln Pro 
165 170 



<210> 7 
<211> 332 
<212> PRT 

<213> Homo sapiens 
<400> 7 

Ser Pro Ala Pro Pro Ala Cys Asp Leu Arg Val Leu Ser Lys Leu Leu 
1 5 10 15 

Arg Asp Ser His Val Leu His Ser Arg Leu Ser Gln Cys Pro Glu Val 
20 25 30 

His Pro Leu Pro Thr Pro Val Leu Leu Pro Ala Val Asp Phe Ser Leu 
35 40 45 

Gly Glu Trp Lys Thr Gln Met Glu Glu Thr Lys Ala Gln Asp Ile Leu 
50 55 60 

Gly Ala Val Thr Leu Leu Leu Glu Gly Val Met Ala Ala Arg Gly Gln 
65 70 75 80 

Leu Gly Pro Thr Cys Leu Ser Ser Leu Leu Gly Gln Leu Ser Gly Gln 
85 90 95 

Val Arg Leu Leu Leu Gly Ala Leu Gln Ser Leu Leu Gly Thr Gln Leu 
100 105 110 

Pro Pro Gln Gly Arg Thr Thr Ala His Lys Asp Pro Asn Ala Ile Phe 
115 120 125 

Leu Ser Phe Gln His Leu Leu Arg Gly Lys Val Arg Phe Leu Met Leu 
130 135 140 

Val Gly Gly Ser Thr Leu Cys Val Arg Arg Ala Pro Pro Thr Thr Ala 
145 150 155 160 

Val Pro Ser Arg Thr Ser Leu Val Leu Thr Leu Asn Glu Leu Pro Asn 
165 170 175 

Arg Thr Ser Gly Leu Leu Glu Thr Asn Phe Thr Ala Ser Ala Arg Thr 
180 185 190 

Thr Gly Ser Gly Leu Leu Lys Trp Gln Gln Gly Phe Arg Ala Lys Ile 
195 200 205 

Pro Gly Leu Leu Asn Gln Thr Ser Arg Ser Leu Asp Gln Ile Pro Gly 
210 215 220 

Tyr Leu Asn Arg Ile His Glu Leu Leu Asn Gly Thr Arg Gly Leu Phe 
225 230 235 240 

Pro Gly Pro Ser Arg Arg Thr Leu Gly Ala Pro Asp Ile Ser Ser Gly 
245 250 255 

Thr Ser Asp Thr Gly Ser Leu Pro Pro Asn Leu Gln Pro Gly Tyr Ser 
260 265 270 

Pro Ser Pro Thr His Pro Pro Thr Gly Gly Tyr Thr Leu Phe Pro Leu 
275 280 285 

Pro Pro Thr Leu Pro Thr Pro Val Val Gln Leu His Pro Leu Leu Pro 
290 295 300 

Asp Pro Ser Ala Pro Thr Pro Thr Pro Thr Ser Pro Leu Leu Asn Thr 
305 310 315 320 

Ser Tyr Thr His Ser Gln Asn Leu Ser Gln Glu Gly 
325 330 



<210> 8 
<211> 127 
<212> PRT 

<213> Homo sapiens 
<400> 8 

Ala Pro Ala Arg Ser Pro Ser Pro Ser Thr Gln Pro Trp Glu His Val 
1 5 10 15 

Asn Ala Ile Gln Glu Ala Arg Arg Leu Leu Asn Leu Ser Arg Asp Thr 
20 25 30 

Ala Ala Glu Met Asn Glu Thr Val Glu Val Ile Ser Glu Met Phe Asp 
35 40 45 

Leu Gln Glu Pro Thr Cys Leu Gln Thr Arg Leu Glu Leu Tyr Lys Gln 
50 55 60 

Gly Leu Arg Gly Ser Leu Thr Lys Leu Lys Gly Pro Leu Thr Met Met 
65 70 75 80 

Ala Ser His Tyr Lys Gln His Cys Pro Pro Thr Pro Glu Thr Ser Cys 
85 90 95 

Ala Thr Gln Ile Ile Thr Phe Glu Ser Phe Lys Glu Asn Leu Lys Asp 
100 105 110 

Phe Leu Leu Val Ile Pro Phe Asp Cys Trp Glu Pro Val Gln Glu 
115 120 125 



<210> 9 
<211> 133 
<212> PRT 

<213> Homo sapiens 
<400> 9 

Ala Pro Thr Ser Ser Ser Thr Lys Lys Thr Gln Leu Gln Leu Glu His 
1 5 10 15 

Leu Leu Leu Asp Leu Gln Met Ile Leu Asn Gly Ile Asn Asn Tyr Lys 
20 25 30 

Asn Pro Lys Leu Thr Arg Met Leu Thr Phe Lys Phe Tyr Met Pro Lys 
35 40 45 

Lys Ala Thr Glu Leu Lys His Leu Gln Cys Leu Glu Glu Glu Leu Lys 
50 55 60 

Pro Leu Glu Glu Val Leu Asn Leu Ala Gln Ser Lys Asn Phe His Leu 
65 70 75 80 

Arg Pro Arg Asp Leu Ile Ser Asn Ile Asn Val Ile Val Leu Glu Leu 
85 90 95 

Lys Gly Ser Glu Thr Thr Phe Met Cys Glu Tyr Ala Asp Glu Thr Ala 
100 105 110 

Thr Ile Val Glu Phe Leu Asn Arg Trp Ile Thr Phe Cys Gln Ser Ile 
115 120 125 

Ile Ser Thr Leu Thr 
130 



<210> 10 
<211> 152 
<212> PRT 

<213> Homo sapiens 
<400> 10 

Met Ser Arg Leu Pro Val Leu Leu Leu Leu Gln Leu Leu Val Arg Pro 
1 5 10 15 

Gly Leu Gln Ala Pro Met Thr Gln Thr Thr Pro Leu Lys Thr Ser Trp 
20 25 30 

Val Asn Cys Ser Asn Met Ile Asp Glu Ile Ile Thr His Leu Lys Gln 
35 40 45 

Pro Pro Leu Pro Leu Leu Asp Phe Asn Asn Leu Asn Gly Glu Asp Gln 
50 55 60 

Asp Ile Leu Met Glu Asn Asn Leu Arg Arg Pro Asn Leu Glu Ala Phe 
65 70 75 80 

Asn Arg Ala Val Lys Ser Leu Gln Asn Ala Ser Ala Ile Glu Ser Ile 
85 90 95 

Leu Lys Asn Leu Leu Pro Cys Leu Pro Leu Ala Thr Ala Ala Pro Thr 
100 105 110 

Arg His Pro Ile His Ile Lys Asp Gly Asp Trp Asn Glu Phe Arg Arg 
115 120 125 

Lys Leu Thr Phe Tyr Leu Lys Thr Leu Glu Asn Ala Gln Ala Gln Gln 
130 135 140 

Thr Thr Leu Ser Leu Ala Ile Phe 
145 150 



<210> 11 
<211> 129 
<212> PRT 

<213> Homo sapiens 
<400> 11 

His Lys Cys Asp Ile Thr Leu Gln Glu Ile Ile Lys Thr Leu Asn Ser 
1 5 10 15 

Leu Thr Glu Gln Lys Thr Leu Cys Thr Glu Leu Thr Val Thr Asp Ile 
20 25 30 

Phe Ala Ala Ser Lys Asn Thr Thr Glu Lys Glu Thr Phe Cys Arg Ala 
35 40 45 

Ala Thr Val Leu Arg Gln Phe Tyr Ser His His Glu Lys Asp Thr Arg 
50 55 60 

Cys Leu Gly Ala Thr Ala Gln Gln Phe His Arg His Lys Gln Leu Ile 
65 70 75 80 

Arg Phe Leu Lys Arg Leu Asp Arg Asn Leu Trp Gly Leu Ala Gly Leu 
85 90 95 

Asn Ser Cys Pro Val Lys Glu Ala Asn Gln Ser Thr Leu Glu Asn Phe 
100 105 110 

Leu Glu Arg Leu Lys Thr Ile Met Arg Glu Lys Tyr Ser Lys Cys Ser 
115 120 125 

Ser 



<210> 12 
<211> 134 
<212> PRT 

<213> Homo sapiens 
<400> 12 

Met Arg Met Leu Leu His Leu Ser Leu Leu Ala Leu Gly Ala Ala Tyr 
1 5 10 15 

Val Tyr Ala Ile Pro Thr Glu Ile Pro Thr Ser Ala Leu Val Lys Glu 
20 25 30 

Thr Leu Ala Leu Leu Ser Thr His Arg Thr Leu Leu Ile Ala Asn Glu 
35 40 45 

Thr Leu Arg Ile Pro Val Pro Val His Lys Asn His Gln Leu Cys Thr 
50 55 60 

Glu Glu Ile Phe Gln Gly Ile Gly Thr Leu Glu Ser Gln Thr Val Gln 
65 70 75 80 

Gly Gly Thr Val Glu Arg Leu Phe Lys Asn Leu Ser Leu Ile Lys Lys 
85 90 95 

Tyr Ile Asp Gly Gln Lys Lys Lys Cys Gly Glu Glu Arg Arg Arg Val 
100 105 110 

Asn Gln Phe Leu Asp Tyr Leu Gln Glu Phe Leu Gly Val Met Asn Thr 
115 120 125 

Glu Trp Ile Ile Glu Ser 
130 



<210> 13 

<211> 212 

<212> PRT 

<213> Homo sapiens 

<400> 13 

Met Asn Ser Phe Ser Thr Ser Ala Phe Gly Pro Val Ala Phe Ser Leu 
1 5 10 15 

Gly Leu Leu Leu Val Leu Pro Ala Ala Phe Pro Ala Pro Val Pro Pro 
20 25 30 

Gly Glu Asp Ser Lys Asp Val Ala Ala Pro His Arg Gln Pro Leu Thr 
35 40 45 

Ser Ser Glu Arg Ile Asp Lys Gln Ile Arg Tyr Ile Leu Asp Gly Ile 
50 55 60 

Ser Ala Leu Arg Lys Glu Thr Cys Asn Lys Ser Asn Met Cys Glu Ser 
65 70 75 80 

Ser Lys Glu Ala Leu Ala Glu Asn Asn Leu Asn Leu Pro Lys Met Ala 
85 90 95 

Glu Lys Asp Gly Cys Phe Gln Ser Gly Phe Asn Glu Glu Thr Cys Leu 
100 105 110 

Val Lys Ile Ile Thr Gly Leu Leu Glu Phe Glu Val Tyr Leu Glu Tyr 
115 120 125 

Leu Gln Asn Arg Phe Glu Ser Ser Glu Glu Gln Ala Arg Ala Val Gln 
130 135 140 

Met Ser Thr Lys Val Leu Ile Gln Phe Leu Gln Lys Lys Ala Lys Asn 
145 150 155 160 

Leu Asp Ala Ile Thr Thr Pro Asp Pro Thr Thr Asn Ala Ser Leu Leu 
165 170 175 

Thr Lys Leu Gln Ala Gln Asn Gln Trp Leu Gln Asp Met Thr Thr His 
180 185 190 

Leu Ile Leu Arg Ser Phe Lys Glu Phe Leu Gln Ser Ser Leu Arg Ala 
195 200 205 

Leu Arg Gln Met 
210 



<210> 14 
<211> 177 
<212> PRT 

<213> Homo sapiens 
<400> 14 

Met Phe His Val Ser Phe Arg Tyr Ile Phe Gly Leu Pro Pro Leu Ile 
1 5 10 15 

Leu Val Leu Leu Pro Val Ala Ser Ser Asp Cys Asp Ile Glu Gly Lys 
20 25 30 

Asp Gly Lys Gln Tyr Glu Ser Val Leu Met Val Ser Ile Asp Gln Leu 
35 40 45 

Leu Asp Ser Met Lys Glu Ile Gly Ser Asn Cys Leu Asn Asn Glu Phe 
50 55 60 

Asn Phe Phe Lys Arg His Ile Cys Asp Ala Asn Lys Glu Gly Met Phe 
65 70 75 80 

Leu Phe Arg Ala Ala Arg Lys Leu Arg Gln Phe Leu Lys Met Asn Ser 
85 90 95 

Thr Gly Asp Phe Asp Leu His Leu Leu Lys Val Ser Glu Gly Thr Thr 
100 105 110 

Ile Leu Leu Asn Cys Thr Gly Gln Val Lys Gly Arg Lys Pro Ala Ala 
115 120 125 

Leu Gly Glu Ala Gln Pro Thr Lys Ser Leu Glu Glu Asn Lys Ser Leu 
130 135 140 

Lys Glu Gln Lys Lys Leu Asn Asp Leu Cys Phe Leu Lys Arg Leu Leu 
145 150 155 160 

Gln Glu Ile Lys Thr Cys Trp Asn Lys Ile Leu Met Gly Thr Lys Glu 
165 170 175 

His 



<210> 15 
<211> 144 
<212> PRT 

<213> Homo sapiens 
<400> 15 

Met Leu Leu Ala Met Val Leu Thr Ser Ala Leu Leu Leu Cys Ser Val 
1 5 10 15 

Ala Gly Gln Gly Cys Pro Thr Leu Ala Gly Ile Leu Asp Ile Asn Phe 
20 25 30 

Leu Ile Asn Lys Met Gln Glu Asp Pro Ala Ser Lys Cys His Cys Ser 
35 40 45 

Ala Asn Val Thr Ser Cys Leu Cys Leu Gly Ile Pro Ser Asp Asn Cys 
50 55 60 

Thr Arg Pro Cys Phe Ser Glu Arg Leu Ser Gln Met Thr Asn Thr Thr 
65 70 75 80 

Met Gln Thr Arg Tyr Pro Leu Ile Phe Ser Arg Val Lys Lys Ser Val 
85 90 95 

Glu Val Leu Lys Asn Asn Lys Cys Pro Tyr Phe Ser Cys Glu Gln Pro 
100 105 110 

Cys Asn Gln Thr Thr Ala Gly Asn Ala Leu Thr Phe Leu Lys Ser Leu 
115 120 125 

Leu Glu Ile Phe Gln Lys Glu Lys Met Arg Gly Met Arg Gly Lys Ile 
130 135 140 



<210> 16 
<211> 178 
<212> PRT 

<213> Homo sapiens 
<400> 16 

Met His Ser Ser Ala Leu Leu Cys Cys Leu Val Leu Leu Thr Gly Val 
1 5 10 15 

Arg Ala Ser Pro Gly Gln Gly Thr Gln Ser Glu Asn Ser Cys Thr His 
20 25 30 

Phe Pro Gly Asn Leu Pro Asn Met Leu Arg Asp Leu Arg Asp Ala Phe 
35 40 45 

Ser Arg Val Lys Thr Phe Phe Gln Met Lys Asp Gln Leu Asp Asn Leu 
50 55 60 

Leu Leu Lys Glu Ser Leu Leu Glu Asp Phe Lys Gly Tyr Leu Gly Cys 
65 70 75 80 

Gln Ala Leu Ser Glu Met Ile Gln Phe Tyr Leu Glu Glu Val Met Pro 
85 90 95 

Gln Ala Glu Asn Gln Asp Pro Asp Ile Lys Ala His Val Asn Ser Leu 
100 105 110 

Gly Glu Asn Leu Lys Thr Leu Arg Leu Arg Leu Arg Arg Cys His Arg 
115 120 125 

Phe Leu Pro Cys Glu Asn Lys Ser Lys Ala Val Glu Gln Val Lys Asn 
130 135 140 

Ala Phe Asn Lys Leu Gln Glu Lys Gly Ile Tyr Lys Ala Met Ser Glu 
145 150 155 160 

Phe Asp Ile Phe Ile Asn Tyr Ile Glu Ala Tyr Met Thr Met Lys Ile 
165 170 175 

Arg Asn 



<210> 17 
<211> 199 
<212> PRT 

<213> Homo sapiens 
<400> 17 

Met Asn Cys Val Cys Arg Leu Val Leu Val Val Leu Ser Leu Trp Pro 
1 5 10 15 

Asp Thr Ala Val Ala Pro Gly Pro Pro Pro Gly Pro Pro Arg Val Ser 
20 25 30 

Pro Asp Pro Arg Ala Glu Leu Asp Ser Thr Val Leu Leu Thr Arg Ser 
35 40 45 

Leu Leu Ala Asp Thr Arg Gln Leu Ala Ala Gln Leu Arg Asp Lys Phe 
50 55 60 

Pro Ala Asp Gly Asp His Asn Leu Asp Ser Leu Pro Thr Leu Ala Met 
65 70 75 80 

Ser Ala Gly Ala Leu Gly Ala Leu Gln Leu Pro Gly Val Leu Thr Arg 
85 90 95 

Leu Arg Ala Asp Leu Leu Ser Tyr Leu Arg His Val Gln Trp Leu Arg 
100 105 110 

Arg Ala Gly Gly Ser Ser Leu Lys Thr Leu Glu Pro Glu Leu Gly Thr 
115 120 125 

Leu Gln Ala Arg Leu Asp Arg Leu Leu Arg Arg Leu Gln Leu Leu Met 
130 135 140 

Ser Arg Leu Ala Leu Pro Gln Pro Pro Pro Asp Pro Pro Ala Pro Pro 
145 150 155 160 

Leu Ala Pro Pro Ser Ser Ala Trp Gly Gly Ile Arg Ala Ala His Ala 
165 170 175 

Ile Leu Gly Gly Leu His Leu Thr Leu Asp Trp Ala Val Arg Gly Leu 
180 185 190 

Leu Leu Leu Lys Thr Arg Leu 
195 



<210> 18 
<211> 219 
<212> PRT 

<213> Homo sapiens 
<400> 18 

Met Cys Pro Ala Arg Ser Leu Leu Leu Val Ala Thr Leu Val Leu Leu 
1 5 10 15 

Asp His Leu Ser Leu Ala Arg Asn Leu Pro Val Ala Thr Pro Asp Pro 
20 25 30 

Gly Met Phe Pro Cys Leu His His Ser Gln Asn Leu Leu Arg Ala Val 
35 40 45 

Ser Asn Met Leu Gln Lys Ala Arg Gln Thr Leu Glu Phe Tyr Pro Cys 
50 55 60 

Thr Ser Glu Glu Ile Asp His Glu Asp Ile Thr Lys Asp Lys Thr Ser 
65 70 75 80 

Thr Val Glu Ala Cys Leu Pro Leu Glu Leu Thr Lys Asn Glu Ser Cys 
85 90 95 

Leu Asn Ser Arg Glu Thr Ser Phe Ile Thr Asn Gly Ser Cys Leu Ala 
100 105 110 

Ser Arg Lys Thr Ser Phe Met Met Ala Leu Cys Leu Ser Ser Ile Tyr 
115 120 125 

Glu Asp Leu Lys Met Tyr Gln Val Glu Phe Lys Thr Met Asn Ala Lys 
130 135 140 

Leu Leu Met Asp Pro Lys Arg Gln Ile Phe Leu Asp Gln Asn Met Leu 
145 150 155 160 

Ala Val Ile Asp Glu Leu Met Gln Ala Leu Asn Phe Asn Ser Glu Thr 
165 170 175 

Val Pro Gln Lys Ser Ser Leu Glu Glu Pro Asp Phe Tyr Lys Thr Lys 
180 185 190 

Ile Lys Leu Cys Ile Leu Leu His Ala Phe Arg Ile Arg Ala Val Thr 
195 200 205 

Ile Asp Arg Val Thr Ser Tyr Leu Asn Ala Ser 
210 215 



<210> 19 
<211> 132 
<212> PRT 

<213> Homo sapiens 
<400> 19 

Met Ala Leu Leu Leu Thr Thr Val Ile Ala Leu Thr Cys Leu Gly Gly 
1 5 10 15 

Phe Ala Ser Pro Gly Pro Val Pro Pro Ser Thr Ala Leu Arg Glu Leu 
20 25 30 

Ile Glu Glu Leu Val Asn Ile Thr Gln Asn Gln Lys Ala Pro Leu Cys 
35 40 45 

Asn Gly Ser Met Val Trp Ser Ile Asn Leu Thr Ala Gly Met Tyr Cys 
50 55 60 

Ala Ala Leu Glu Ser Leu Ile Asn Val Ser Gly Cys Ser Ala Ile Glu 
65 70 75 80 

Lys Thr Gln Arg Met Leu Ser Gly Phe Cys Pro His Lys Val Ser Ala 
85 90 95 

Gly Gln Phe Ser Ser Leu His Val Arg Asp Thr Lys Ile Glu Val Ala 
100 105 110 

Gln Phe Val Lys Asp Leu Leu Leu His Leu Lys Lys Leu Phe Arg Glu 
115 120 125 

Gly Arg Phe Asn 
130 



<210> 20 
<211> 114 
<212> PRT 

<213> Homo sapiens 
<400> 20 

Asn Trp Val Asn Val Ile Ser Asp Leu Lys Lys Ile Glu Asp Leu Ile 
1 5 10 15 

Gln Ser Met His Ile Asp Ala Thr Leu Tyr Thr Glu Ser Asp Val His 
20 25 30 

Pro Ser Cys Lys Val Thr Ala Met Lys Cys Phe Leu Leu Glu Leu Gln 
35 40 45 

Val Ile Ser Leu Glu Ser Gly Asp Ala Ser Ile His Asp Thr Val Glu 
50 55 60 

Asn Leu Ile Ile Leu Ala Asn Asn Ser Leu Ser Ser Asn Gly Asn Val 
65 70 75 80 

Thr Glu Ser Gly Cys Lys Glu Cys Glu Glu Leu Glu Glu Lys Asn Ile 
85 90 95 

Lys Glu Phe Leu Gln Ser Phe Val His Ile Val Gln Met Phe Ile Asn 
100 105 110 

Thr Ser 



<210> 21 
<211> 252 
<212> PRT 

<213> Homo sapiens 
<400> 21 

Met Gly Val Leu Leu Thr Gln Arg Thr Leu Leu Ser Leu Val Leu Ala 
1 5 10 15 

Leu Leu Phe Pro Ser Met Ala Ser Met Ala Ala Ile Gly Ser Cys Ser 
20 25 30 

Lys Glu Tyr Arg Val Leu Leu Gly Gln Leu Gln Lys Gln Thr Asp Leu 
35 40 45 

Met Gln Asp Thr Ser Arg Leu Leu Asp Pro Tyr Ile Arg Ile Gln Gly 
50 55 60 

Leu Asp Val Pro Lys Leu Arg Glu His Cys Arg Glu Arg Pro Gly Ala 
65 70 75 80 

Phe Pro Ser Glu Glu Thr Leu Arg Gly Leu Gly Arg Arg Gly Phe Leu 
85 90 95 

Gln Thr Leu Asn Ala Thr Leu Gly Cys Val Leu His Arg Leu Ala Asp 
100 105 110 

Leu Glu Gln Arg Leu Pro Lys Ala Gln Asp Leu Glu Arg Ser Gly Leu 
115 120 125 

Asn Ile Glu Asp Leu Glu Lys Leu Gln Met Ala Arg Pro Asn Ile Leu 
130 135 140 

Gly Leu Arg Asn Asn Ile Tyr Cys Met Ala Gln Leu Leu Asp Asn Ser 
145 150 155 160 

Asp Thr Ala Glu Pro Thr Lys Ala Gly Arg Gly Ala Ser Gln Pro Pro 
165 170 175 

Thr Pro Thr Pro Ala Ser Asp Ala Phe Gln Arg Lys Leu Glu Gly Cys 
180 185 190 

Arg Phe Leu His Gly Tyr His Arg Phe Met His Ser Val Gly Arg Val 
195 200 205 

Phe Ser Lys Trp Gly Glu Ser Pro Asn Arg Ser Arg Arg His Ser Pro 
210 215 220 

His Gln Ala Leu Arg Lys Gly Val Arg Arg Thr Arg Pro Ser Arg Lys 
225 230 235 240 

Gly Lys Arg Leu Met Thr Arg Gly Gln Leu Pro Arg 
245 250 



<210> 22 
<211> 200 
<212> PRT 

<213> Homo sapiens 
<400> 22 

Met Ala Phe Thr Glu His Ser Pro Leu Thr Pro His Arg Arg Asp Leu 
1 5 10 15 

Cys Ser Arg Ser Ile Trp Leu Ala Arg Lys Ile Arg Ser Asp Leu Thr 
20 25 30 

Ala Leu Thr Glu Ser Tyr Val Lys His Gln Gly Leu Asn Lys Asn Ile 
35 40 45 

Asn Leu Asp Ser Ala Asp Gly Met Pro Val Ala Ser Thr Asp Gln Trp 
50 55 60 

Ser Glu Leu Thr Glu Ala Glu Arg Leu Gln Glu Asn Leu Gln Ala Tyr 
65 70 75 80 

Arg Thr Phe His Val Leu Leu Ala Arg Leu Leu Glu Asp Gln Gln Val 
85 90 95 

His Phe Thr Pro Thr Glu Gly Asp Phe His Gln Ala Ile His Thr Leu 
100 105 110 

Leu Leu Gln Val Ala Ala Phe Ala Tyr Gln Ile Glu Glu Leu Met Ile 
115 120 125 

Leu Leu Glu Tyr Lys Ile Pro Arg Asn Glu Ala Asp Gly Met Pro Ile 
130 135 140 

Asn Val Gly Asp Gly Gly Leu Phe Glu Lys Lys Leu Trp Gly Leu Lys 
145 150 155 160 

Val Leu Gln Glu Leu Ser Gln Trp Thr Val Arg Ser Ile His Asp Leu 
165 170 175 

Arg Phe Ile Ser Ser His Gln Thr Gly Ile Pro Ala Arg Gly Ser His 
180 185 190 

Tyr Ile Ala Asn Asn Lys Lys Met 
195 200 



<210> 23 

<211> 181 

<212> PRT 

<213> Homo sapiens 

<400> 23 

Ser Pro Leu Pro Ile Thr Pro Val Asn Ala Thr Cys Ala Ile Arg His 
1 5 10 15 

Pro Cys His Asn Asn Leu Met Asn Gln Ile Arg Ser Gln Leu Ala Gln 
20 25 30 

Leu Asn Gly Ser Ala Asn Ala Leu Phe Ile Leu Tyr Tyr Thr Ala Gln 
35 40 45 

Gly Glu Pro Phe Pro Asn Asn Leu Asp Lys Leu Cys Gly Pro Asn Val 
50 55 60 

Thr Asp Phe Pro Pro Phe His Ala Asn Gly Thr Glu Lys Ala Lys Leu 
65 70 75 80 

Val Glu Leu Tyr Arg Ile Val Val Tyr Leu Gly Thr Ser Leu Gly Asn 
85 90 95 

Ile Thr Arg Asp Gln Lys Ile Leu Asn Pro Ser Ala Leu Ser Leu His 
100 105 110 

Ser Lys Leu Asn Ala Thr Ala Asp Ile Leu Arg Gly Leu Leu Ser Asn 
115 120 125 

Val Leu Cys Arg Leu Cys Ser Lys Tyr His Val Gly His Val Asp Val 
130 135 140 

Thr Tyr Gly Pro Pro Asp Thr Ser Gly Lys Asp Val Phe Gln Lys Lys 
145 150 155 160 

Lys Leu Gly Cys Gln Leu Leu Gly Lys Tyr Lys Gln Ile Ile Ala Val 
165 170 175 

Leu Ala Gln Ala Phe 
180 



<210> 24 
<211> 39 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: PCR Primer 
<400> 24 

gggggtcgac catatgttcc caaccattcc cttatccag 39 
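
As a hedged illustration, the sketch below (Python; the function and variable names are hypothetical and do not appear in the application) translates the primer of SEQ ID NO:24 from its first ATG codon using the standard genetic code. The choice of reading frame is an assumption made for the example. Under that assumption, the codons after the initiator methionine encode Phe Pro Thr Ile Pro Leu Ser, which are the first seven residues of SEQ ID NO:1.

```python
# Illustrative sketch only: names are hypothetical. The reading frame
# (first ATG in the primer) is an assumption made for this example.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Standard genetic code, built in the conventional T, C, A, G ordering.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# SEQ ID NO:24 as listed, with the grouping spaces removed.
PRIMER_24 = "gggggtcgac catatgttcc caaccattcc cttatccag".replace(" ", "")


def translate_from_first_atg(dna):
    """Translate complete codons starting at the first 'atg' in dna."""
    start = dna.find("atg")
    if start < 0:
        return ""
    coding = dna[start:]
    usable = len(coding) - len(coding) % 3  # drop any trailing partial codon
    return "".join(CODON_TABLE[coding[i:i + 3]] for i in range(0, usable, 3))


peptide = translate_from_first_atg(PRIMER_24)
print(len(PRIMER_24))  # 39, matching <211> 39
print(peptide)         # MFPTIPLS
```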

<210> 25 
<211> 33 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: PCR Primer 
<400> 25 

gggggatcct cactagaagc cacagctgcc ctc 33 

<210> 26 
<211> 39 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: PCR Primer 
<400> 26 

ccccggatcc gccaccatgg atctctggca gctgctgtt 39 

<210> 27 
<211> 40 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: PCR Primer 
<400> 27 

ccccgtcgac tctagagcta ttaaatacgt agctcttggg 40 

<210> 28 
<211> 32 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: PCR Primer 
<400> 28 

cgcggatccg attagaatcc acagctcccc tc 32 

<210> 29 
<211> 66 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: PCR Primer 
<400> 29 

ccccctctag acatatgaag aagaacatcg cattcctgct ggcatctatg ttcgttttct 60 
ctatcg 66 



<210> 30 
<211> 65 
<212> DNA 
<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: PCR Primer 
<400> 30 

gcatctatgt tcgttttctc tatcgctacc aacgcttacg cattcccaac cattccctta 60 
tccag 65 



<210> 31 
<211> 62 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: PCR Primer 
<400> 31 

gcagtggcac tggctggttt cgctaccgta gcgcaggcct tcccaaccat tcccttatcc 60 
ag 62 



<210> 32 
<211> 59 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: PCR Primer 
<400> 32 

ccccgtcgac acatatgaag aagacagcta tcgcgattgc agtggcactg gctggtttc 59 

<210> 33 
<211> 36 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: PCR Primer 
<400> 33 

ctgcttgaag atctgcccac accgggggct gccatc 36 



<210> 34 
<211> 24 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: PCR Primer 
<400> 34 

gtagcgcagg ccttcccaac catt 24 

<210> 35 
<211> 39 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: PCR Primer 
<400> 35 

ctgcttgaag atctgcccag tccgggggca gccatcttc 39 

<210> 36 
<211> 51 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: PCR Primer 
<400> 36 

gggcagatct tcaagcagac ctacagcaag ttcgactgca actcacacaa c 51 

<210> 37 
<211> 34 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: PCR Primer 
<400> 37 

cgcggtaccc gggatccgat tagaatccac agct 34 

<210> 38 
<211> 36 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: PCR Primer 
<400> 38 

gggcagatct tcaagcagac ctactgcaag ttcgac 36 

<210> 39 
<211> 42 
<212> DNA 